Computational Intelligence (CI) techniques are widely used to tackle complex modelling, prediction, and recognition tasks in many research fields. One such field is Digital Audio, which finds application in entertainment, security, forensics, and health, to name but a few. A wide variety of services and products now incorporate Digital Audio technologies, and they are characterized by steadily increasing complexity, interactivity, and intelligence.
The typical methodology adopted in these engineering solutions consists of extracting and manipulating useful information from the audio stream to drive the execution of automated services. Several technical areas in Digital Audio, involving different kinds of audible signals, share this approach. In the “music” case, music information retrieval is the major topic of interest, with many diverse sub-topics; for “speech”, the obvious examples are speech and speaker recognition, along with the many topics closely related to the computational analysis of speech signals (affective computing and language processing, to name just two); in the case of “sound”, acoustic fingerprinting and signatures, acoustic monitoring, and sound detection and identification have recently attracted considerable interest. Cross-domain approaches that exploit the information contained in diverse kinds of environmental audio signals have also been investigated recently. In many application contexts, audio is processed in conjunction with data from other media, such as text and video, which requires specific fusion techniques.
In dealing with these problems, the adoption of data-driven learning systems is often a “must”, and the recent success of deep learning approaches (in Speech Recognition, for instance) further confirms this. Such systems are not, however, immune to technological issues: they must cope with non-stationary operating conditions and hard real-time constraints, which are often made harder by the large amount of data to process. In other application contexts, the challenge is instead the scarcity of training data, and suitable architectures and algorithms need to be purpose-designed. Last but not least, a key issue in Intelligent Audio Applications is the capability to learn representative features at different levels of abstraction without supervision. Here again, the deep learning paradigm has recently enabled significant advances, although many open issues remain to be investigated.
It is therefore of great interest to the scientific community to understand how, and to what extent, novel CI-based techniques can be efficiently employed in Digital Audio, in light of all the aforementioned aspects. The mission of the Task Force is to promote research on the application of innovative CI solutions to Digital Audio problems, and to establish a fruitful networking platform for scientists and technicians working in the field. Specific activities will be organized to this end, such as technical meetings, special sessions, special issues, and workshops. In this way, the proposers of the Task Force aim to create a solid reference point for Digital Audio within the entire CI community and, at the same time, to attract researchers who share a similar approach but operate within other Scientific Societies.
