Voice imitator - deepfake. Polish speech synthesis using convolutional neural networks

The program generates (imitates) the speech of a selected person. The speech synthesizer is based on deep convolutional neural networks. To train the network, a Polish-language dataset of over 12,000 text-audio pairs was prepared, with a total duration of almost 20 hours. From this data the program learned the sound of my voice, so it can speak any text in my voice, including sentences that were not in the training set. The generated voice is then processed by digital filters.
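The description does not say which digital filters are applied to the generated voice, so the following is only a minimal sketch of what such a post-processing step could look like. It assumes a simple first-order low-pass filter that softens high-frequency artifacts. The function names `one_pole_lowpass` and `peak`, the 1 kHz cutoff and the 22,050 Hz sample rate are all hypothetical choices made for this illustration, not details of the actual program.

```python
import math


def one_pole_lowpass(samples, cutoff_hz, sample_rate_hz):
    """Apply a first-order IIR low-pass filter to a sequence of float samples.

    Hypothetical post-processing step: the original project does not
    specify its filters, so this is an illustrative assumption.
    """
    if cutoff_hz <= 0 or sample_rate_hz <= 0:
        raise ValueError("cutoff_hz and sample_rate_hz must be positive")
    dt = 1.0 / sample_rate_hz                # time between two samples
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)   # time constant of the equivalent RC circuit
    alpha = dt / (rc + dt)                   # smoothing factor, between 0 and 1
    filtered = []
    previous = 0.0
    for x in samples:
        # Move the output a fraction alpha of the way towards the new input.
        previous = previous + alpha * (x - previous)
        filtered.append(previous)
    return filtered


def peak(samples):
    """Return the largest absolute sample value."""
    return max(abs(s) for s in samples)


if __name__ == "__main__":
    rate = 22050  # assumed sample rate in Hz
    low_tone = [math.sin(2 * math.pi * 100 * i / rate) for i in range(4000)]
    high_tone = [math.sin(2 * math.pi * 8000 * i / rate) for i in range(4000)]
    # The 100 Hz tone should pass almost unchanged; the 8 kHz tone should be attenuated.
    print("100 Hz peak after filtering:", peak(one_pole_lowpass(low_tone, 1000.0, rate)[2000:]))
    print("8 kHz peak after filtering:", peak(one_pole_lowpass(high_tone, 1000.0, rate)[2000:]))
```

A real pipeline would more likely use a higher-order filter from a signal-processing library. The one-pole form is used here only because it needs nothing beyond the standard library and makes the filtering step easy to follow.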

 

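An example of a sentence synthesized in Polish: "To jest przykład imitacji mojego głosu przez program komputerowy. Teraz spróbuję zrobić to samo z głosem kogoś innego i zobaczymy, co z tego wyjdzie - już niedługo." (In English: "This is an example of my voice being imitated by a computer program. Now I will try to do the same with someone else's voice and we will see what comes of it - coming soon.")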