Genre Identification

What started as a mini-project for a class turned into a way for me to learn and explore Weka, a data mining framework, and to try out MEX functions in Matlab.

I used the GTZAN genre dataset as my source audio, extracting multiple low-level features such as Spectral Centroid, Spectral Decrease and Mel Frequency Cepstral Coefficients, and deriving various sub-features from the initial set.
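To give a feel for what one of these low-level features measures, here is a minimal Python sketch of the spectral centroid of a single audio frame. It is not the Matlab/MEX code from the project: the function names, the frame length and the naive DFT are all assumptions made for illustration, and a real pipeline would use an FFT library and windowed, overlapping frames.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitudes of the non-negative-frequency DFT bins (naive O(N^2) DFT)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency of the frame, in Hz."""
    mags = magnitude_spectrum(frame)
    total = sum(mags)
    if total == 0:
        return 0.0  # silent frame: no meaningful centroid, report 0 Hz
    n = len(frame)
    # Bin k corresponds to frequency k * sample_rate / n.
    weighted = sum(k * sample_rate / n * m for k, m in enumerate(mags))
    return weighted / total


if __name__ == "__main__":
    # Hypothetical test signal: a 1 kHz sine sampled at 8 kHz, 64 samples.
    sr, n = 8000, 64
    tone = [math.sin(2 * math.pi * 1000 * i / sr) for i in range(n)]
    print(spectral_centroid(tone, sr))
```

For a pure tone that falls exactly on a DFT bin, as in the demo, the centroid should come out at roughly the tone's own frequency. For real music it tracks where the "centre of mass" of the spectrum sits, which is why it is often described as a brightness measure.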

After exporting the features from Matlab into Weka, I standardized the inputs and tested different classifiers. I got about 76% correctly classified instances with the libSVM wrapper and about 58.6% with the C4.5 algorithm (J48 in Weka). A Random Forest with 10 trees gave me a rate of 65%, while a forest with 20 trees gave me about 72%.
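The export and standardization steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: in the project the export was done from Matlab and the standardization inside Weka, whereas here both happen in one script. The helper names, the `genre` attribute name and the use of the population standard deviation are hypothetical choices, and Weka's own filter may differ in detail.

```python
import statistics


def standardize_columns(rows):
    """Rescale each feature column to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Population standard deviation (an assumption); constant columns
    # fall back to 1.0 so we never divide by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]


def to_arff(relation, feature_names, rows, labels):
    """Render numeric features plus a nominal class label as ARFF text."""
    classes = sorted(set(labels))
    lines = [f"@relation {relation}", ""]
    lines += [f"@attribute {name} numeric" for name in feature_names]
    lines.append("@attribute genre {" + ",".join(classes) + "}")
    lines += ["", "@data"]
    for row, label in zip(rows, labels):
        lines.append(",".join(f"{v:.6f}" for v in row) + "," + label)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up feature values, purely to show the file layout.
    features = [[1500.0, 0.2], [2300.0, 0.5], [1900.0, 0.1]]
    genres = ["blues", "rock", "jazz"]
    print(to_arff("genres", ["centroid", "decrease"],
                  standardize_columns(features), genres))
```

The resulting text can be saved with an `.arff` extension and opened in Weka. Putting the class attribute last matches what Weka assumes by default when it picks the class to predict.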

I’m using Weka for my Master’s project, identifying bird species through audio. Will post updates soon!

