Speech.jl? #3
Comments
MFCC is now in METADATA. Apart from default parameter settings that should mimic the HTK defaults (we could use some testing there), it has some other parameter sets (:rasta should mimic the default parameters of the rastamat package). It also has various forms of feature normalization (mean/variance: znorm(), and short-time Gaussianization: warp()), derivatives (delta()), and shifted delta cepstra (SDCs, used in language recognition). We could use some additional code to compute PLP (perceptual linear prediction) coefficients and to do RASTA processing. Others might be interested in LPC estimation, pitch extraction, etc. For recognition these are not too useful, but for (re)synthesis they may be. I have higher-level code in https://github.com/davidavdav/Feacalc.jl.git, which can read .wav files, has some trivial energy-based speech activity detection, and has save/load routines in HDF5 (for compatibility with non-Julia software).
That's great! So since MFCC is in METADATA, maybe we should aim for a higher degree of specialization in speech-related packages rather than having a single mega-package.
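For readers unfamiliar with the terms above, here is a minimal Python sketch of two of the operations mentioned: per-dimension mean/variance normalization and regression-based delta features. It is not MFCC.jl's implementation. The function names `mean_variance_normalize` and `regression_delta`, the `width` parameter, and the edge handling (repeating the first and last frame) are assumptions made for illustration.

```python
from statistics import fmean, pstdev


def mean_variance_normalize(frames):
    """Scale each feature dimension to zero mean and unit variance over time.

    `frames` is a list of equal-length feature vectors, one per time frame.
    """
    columns = list(zip(*frames))
    means = [fmean(c) for c in columns]
    # A constant column has zero deviation; fall back to 1.0 to avoid dividing by zero.
    stds = [pstdev(c) or 1.0 for c in columns]
    return [
        [(v - m) / s for v, m, s in zip(frame, means, stds)]
        for frame in frames
    ]


def regression_delta(frames, width=2):
    """Estimate the time derivative of each feature with a linear regression
    over +/- `width` frames: sum_k k * (c[t+k] - c[t-k]) / (2 * sum_k k^2).

    Edges are handled by repeating the first and last frame (an assumption).
    """
    n = len(frames)
    dims = len(frames[0])
    denom = 2 * sum(k * k for k in range(1, width + 1))
    out = []
    for t in range(n):
        row = []
        for d in range(dims):
            acc = 0.0
            for k in range(1, width + 1):
                ahead = frames[min(t + k, n - 1)][d]
                behind = frames[max(t - k, 0)][d]
                acc += k * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


# Toy input: the first feature rises by 1 per frame, the second is constant.
frames = [[0.0, 10.0], [1.0, 10.0], [2.0, 10.0], [3.0, 10.0], [4.0, 10.0]]
normalized = mean_variance_normalize(frames)
deltas = regression_delta(frames)
```

On the toy input, the delta of the linearly rising feature at the middle frame equals its slope, and the constant feature has zero delta everywhere.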
I suppose it would be better if |
I think moving it into JuliaDSP seems pretty reasonable. MFCCs are pretty important (even outside of speech, e.g. music processing), so it's nice to have that stuff in an org rather than a personal repo.
If other folks are on board, I think the process is roughly:
Incidentally, JuliaAudio could also be a reasonable org for this to live in, though that family of packages is somewhat more opinionated w.r.t. samplerate-aware buffer and stream types, so the package might need some minor refactoring to interoperate nicely with them.
Could you add me to JuliaDSP then? I tried to transfer ownership, but that didn't work as I was not allowed.
I'm actually not a JuliaDSP member either, so I can't add you.
All right, MFCC.jl is now part of JuliaDSP.
I would like to start drafting a new package for speech signal processing, focused mainly on speech feature extraction (MFCCs, LPCs, fundamental frequency, etc.). @davidavdav has done a lot of work on MFCCs at MFCC.jl but, as of the last time we talked, it needed some updates. @davidavdav, would you mind chiming in with your comments and suggestions? Thanks!