The task of virus classification into subtypes is an important concern in many categorization studies, e.g., in virology or epidemiology. Therefore, the problem of virus subtyping has been a subject of considerable interest in the last decade.
Although several virus subtyping tools exist, they are often dedicated to a specific family of viruses. Even specialized methods, however, often fail to correctly subtype viruses such as HIV or influenza. To address these shortcomings, we present the viral genome deep classifier (VGDC), a tool for automatic virus subtyping that employs a deep convolutional neural network (CNN). The method is universal and can be applied to subtype any virus, as confirmed by experiments on dengue, hepatitis B and C, HIV-1, and influenza A datasets.
For all considered virus types, the obtained classification rates are very high, with the corresponding F1-score values ranging from about 0.85 to 1.00 depending on the virus type and the number of subtypes considered. For HIV-1 and influenza A, the VGDC significantly outperforms the leading competitors, including CASTOR and COMET.
The VGDC source code is freely available to facilitate easy usage and comparison with future approaches.