Ing. Mgr. Psutka Josef V., Ph.D.
Courses
Course guarantor
Course lecturer
- Speech Analysis and Recognition (KKY/ARŘ)
- Machine Learning, Problem Solving and Recognition (KKY/SUR)
- Perception and Understanding Systems (KKY/SVP)
- Fundamentals of Machine Learning and Recognition (KKY/ZSUR)
- Signal Processing (KKY/ZSI)
Course tutorial instructor
- Speech Analysis and Recognition (KKY/ARŘ)
- Machine Learning, Problem Solving and Recognition (KKY/SUR)
- Perception and Understanding Systems (KKY/SVP)
- Learning Systems and Classifiers (KKY/USK)
- Fundamentals of Machine Learning and Recognition (KKY/ZSUR)
- Signal Processing (KKY/ZSI)
Supervised student theses
Title | Type / year | Supervisor / field | Status |
---|---|---|---|
Automating data preparation for acoustic model building | DP | Psutka Josef V. / UI | open |
Automating acoustic model building | DP BP | Psutka Josef V. / UI | open |
Optimization of acoustic model parameters | BP SPC | Psutka Josef V. / UI | open |
Preparing a general acoustic model for sports broadcasts | DP | Psutka Josef V. / UI | open |
Using voicing assimilation in acoustic model building | BP SPC | Psutka Josef V. / UI | open |
Using state-of-the-art acoustic modeling methods in real-time speech recognition systems | DP BP | Psutka Josef V. / UI | open |
Alternative construction of multi-component acoustic models | BP SPC | Psutka Josef V. / UI | completed |
Music detection in speech signals | DP BP | Psutka Josef V. / UI | assigned (Viktor MÄRZ) |
Optimal placement of band-pass filters in MFCC for a specific speaker | DP | Psutka Josef V. / UI | completed |
Publications
Publications in 2020
Diarization Based on Identification with X-Vectors. Speech and Computer, 22nd International Conference, SPECOM 2020, St. Petersburg, Russia, October 7-9, 2020, Proceedings, p. 667-678, 2020.
Publications in 2019
Diarization of The Language Consulting Center Telephone Calls. Speech and Computer (SPECOM 2019), p. 549-558, Springer, Cham, 2019.
Publications in 2018
First Insight into the Processing of the Language Consulting Center Data. Speech and Computer 20th International Conference (SPECOM 2018), p. 778-787, Cham: Springer Nature Switzerland AG, 2018.
Towards Processing of the Oral History Interviews and Related Printed Documents. Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), 2104, European Language Resources Association (ELRA), 2018.
Publications in 2014
Captioning of Live TV Commentaries from the Olympic Games in Sochi: Some Interesting Insights. Lecture Notes in Artificial Intelligence, vol. 8655, p. 515-522, Springer, 2014.
Publications in 2013
Covariance Matrix Enhancement Approach to Train Robust Gaussian Mixture Models of Speech Data. Speech and Computer, Lecture Notes in Computer Science, vol. 8113, p. 92-99, Springer, 2013.
Online Speaker Adaptation of an Acoustic Model using Face Recognition. Text, Speech and Dialogue, Proceedings of the 16th International Conference TSD 2013, Lecture Notes in Artificial Intelligence, vol. 8082, p. 378-385, Springer Berlin Heidelberg, 2013.
Publications in 2012
Full Covariance Gaussian Mixture Models Evaluation on GPU. IEEE International Symposium on Signal Processing and Information Technology, Vietnam, Ho Chi Minh City, 2012.
Optimized Acoustic Likelihoods Computation for NVIDIA and ATI/AMD Graphics Processors. IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 6, p. 1818-1828, Institute of Electrical and Electronics Engineers (IEEE), 2012.
Publications in 2011
Speaker-clustered Acoustic Models Evaluated on GPU for on-line Subtitling of Parliament Meetings. Text, Speech, and Dialogue, Lecture Notes in Computer Science, vol. 6836, p. 284-290, Springer, 2011.
Optimization of the Gaussian Mixture Model Evaluation on GPU. 12th Annual Conference of the International Speech Communication Association 2011 (INTERSPEECH 2011), p. 1748-1751, Firenze, Italy, 2011.
Publications in 2010
Training of Speaker-Clustered Discriminative Acoustic Models for Use in Real-Time Recognizers. Speech Processing, vol. 2010, p. 152-158, Institute of Photonics and Electronics AS CR, Prague, 2010.
Gender-dependent acoustic models fusion developed for automatic subtitling of Parliament meetings broadcasted by the Czech TV. Lecture Notes in Computer Science, vol. 2010, p. 431-438, Springer, Berlin, 2010.
Fast Phonetic/Lexical Searching in the Archives of the Czech Holocaust Testimonies: Advancing Towards the MALACH Project Visions. Lecture Notes in Computer Science, vol. 2010, p. 385-391, Springer, Heidelberg, 2010.
Publications in 2009
Discriminative training of gender-dependent acoustic models. Text, Speech and Dialogue, p. 331-338, Springer, Plzeň, 2009.
Training of Speaker-Clustered Acoustic Models for Use in Real-Time Recognizers. Proceedings of the International Conference on Signal Processing and Multimedia Application, p. 131-135, INSTICC, Milan, 2009.
Fast Speaker Adaptation in Automatic Online Subtitling. SIGMAP, p. 126-130, Italy, 2009.
Using Morphological Information for Robust Language Modeling in Czech ASR System. IEEE Transactions on Audio Speech and Language Processing, vol. 17, p. 840-847, IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 2009.
Publications in 2008
Automatic Speech Recognition and Information Retrieval Techniques for Facilitating Access to Video Archives of Cultural Heritage. IEEE SMC International Conference on Distributed Human-Machine Systems, p. 323-328, Czech Technical University, Athens, 2008.
Efficient Combination of N-gram Language Models and Recognition Grammars in Real-Time LVCSR Decoder. 9th International Conference on Signal Processing Proceedings, p. 587-591, IEEE, Beijing, China, 2008.
Multiple Application of the MLLT Based on Clustering Supported by Phonetic Knowledge. 9th International Conference on Signal Processing Proceedings, p. 613-617, IEEE, Beijing, China, 2008.
Publications in 2007
Robust PLP-Based Parameterization for ASR Systems. SPECOM 2007 Proceedings, p. 509-515, Moscow State Linguistic University, Moscow, 2007.
Feature space reduction and decorrelation in a large number of speech recognition experiments. Signal and Image Processing, p. 158-161, ACTA Press, Anaheim, 2007.
LIVE TV SUBTITLING - Fast 2-pass LVCSR System for Online Subtitling. SIGMAP 2007, p. 139-142, INSTICC PRESS, Lisbon, 2007.
Searching for a robust MFCC-based parameterization for ASR application. SIGMAP 2007, p. 196-199, INSTICC PRESS, Lisbon, 2007.
Techniky parametrizace, dekorelace a redukce dimenze příznaků v systémech rozpoznávání řeči [Techniques of feature parameterization, decorrelation and dimensionality reduction in speech recognition systems]. p. 134, University of West Bohemia, Faculty of Applied Sciences, Pilsen, Czech Republic, Plzeň, 2007.
Benefit of maximum likelihood linear transform (MLLT) used at different levels of covariance matrices clustering in ASR systems. Lecture Notes in Artificial Intelligence, 4629, p. 431-438, 2007.
Hungarian MALACH acoustic front-end. Department of Cybernetics, Faculty of Applied Sciences, University of West Bohemia in Pilsen, 2007.
Publications in 2006
Automatic online subtitling of the Czech parliament meetings. Lecture Notes in Artificial Intelligence (ISSN 0302-9743), 4188, p. 501-508, Springer, Berlin, 2006.
Comparison of various feature decorrelation techniques in automatic speech recognition. CITSA 2006, p. 165-168, IIIS, Orlando, 2006.
Recognition of spontaneous speech - some problems and their solutions. CITSA 2006, p. 169-172, IIIS, Orlando, 2006.
Comparison between GMM and decision graphs based silence/speech detection method. Proceedings of the 11th international conference "Speech and computer" SPECOM'2006, p. 376-379, Anatolya Publishers, St. Petersburg, 2006.
Benefit of a class-based language model for real-time closed-captioning of TV ice-hockey commentaries. Proceedings of LREC 2006, p. 2064-2067, ELRA, Paris, 2006.
Modul zpracování klíčových slov CZ [Keyword processing module CZ]. Department of Cybernetics, University of West Bohemia in Pilsen, 2006.
Polish Malach Speech Corpus. Department of Cybernetics, Faculty of Applied Sciences, University of West Bohemia in Pilsen, Johns Hopkins University, Baltimore, Shoah Visual History Foundation, 2006.
Slovak Malach Speech Corpus. Department of Cybernetics, Faculty of Applied Sciences, University of West Bohemia in Pilsen, Johns Hopkins University, Baltimore, Shoah Visual History Foundation, 2006.
Slovak Spontaneous Speech – Acoustic&Language Models (MALACH). Department of Cybernetics, Faculty of Applied Sciences, University of West Bohemia in Pilsen, Johns Hopkins University, Baltimore, Shoah Visual History Foundation, 2006.
Publications in 2005
Keyword spotting with triphone based filler model. SPECOM 2005 proceedings, p. 487-490, Moscow State Linguistic University, Moscow, 2005.
LVCSR system for automatic online subtitling. SPECOM 2005 proceedings, p. 325-328, Moscow State Linguistic University, Moscow, 2005.
Building robust PLP-based acoustic module for ASR applications. SPECOM 2005 proceedings, p. 761-764, Moscow State Linguistic University, Moscow, 2005.
Automatic transcription of Czech, Russian and Slovak spontaneous speech in the MALACH project. Interspeech Lisboa 2005, p. 1349-1352, ISCA, Bonn, 2005.
Automatic transcription of Czech, Russian and Slovak spontaneous speech in the MALACH project. Eurospeech, vol. 1, p. 1349-1352, ISCA, Bonn, 2005.
Czech Spontaneous Speech – Acoustic&Language Models (MALACH). Department of Cybernetics, Faculty of Applied Sciences, University of West Bohemia in Pilsen, Johns Hopkins University, Baltimore, Shoah Visual History Foundation, 2005.
Russian Malach Speech Corpus. Department of Cybernetics, Faculty of Applied Sciences, University of West Bohemia in Pilsen, Johns Hopkins University, Baltimore, Shoah Visual History Foundation, 2005.
Russian Spontaneous Speech – Acoustic&Language Models (MALACH). Department of Cybernetics, Faculty of Applied Sciences, University of West Bohemia in Pilsen, Johns Hopkins University, Baltimore, Shoah Visual History Foundati, 2005.
Publications in 2004
Issues in annotation of the Czech spontaneous speech corpus in the MALACH project. Fourth international conference on language resources and evaluation, p. 607-610, European Language Resources Association, Lisbon, 2004.
Czech broadcast news speech. p. 4, Linguistic Data Consortium (LDC), USA, 2004.
Czech broadcast news transcripts. p. 4, Linguistic Data Consortium (LDC), USA, 2004.
Czech Broadcast News Corpus. Department of Cybernetics, Faculty of Applied Sciences, University of West Bohemia in Pilsen (distribution rights transferred to the Linguistic Data Consortium, University of Pe, 2004.
Publications in 2003
Building LVCSR system for transcription of spontaneously pronounced Russian testimonies in the MALACH project: initial steps and first results. Lecture Notes in Artificial Intelligence, 2807, p. 327-332, Springer, Berlin, 2003.
Building LVCSR system for transcription of spontaneously pronounced Russian testimonies in the MALACH project: initial steps and first results. Lecture Notes in Computer Science, Lecture Notes in Artificial Intelligence, 2607, p. 327-332, Springer, Berlin, 2003.
Towards automatic transcription of spontaneous Czech speech in the MALACH project. Lecture Notes in Artificial Intelligence, 2807, p. 214-219, Springer, Berlin, 2003.
Large vocabulary ASR for spontaneous Czech in the MALACH project. EUROSPEECH 2003 PROCEEDINGS, p. 1821-1824, ISCA, Geneva, 2003.
Large vocabulary ASR for spontaneous Czech in the MALACH project. Eurospeech, vol. 1, p. 1821-1824, ISCA, Geneva, 2003.
Automatic transcription of TV ice-hockey commentary. Proceedings, p. 419-423, International Institute of Informatics and Systemics, Orlando, 2003.
Optimization of some parameters in the speech-processing module developed for the speaker independent ASR system. Proceedings, p. 414-418, International Institute of Informatics and Systemics, Orlando, 2003.
Publications in 2002
Automatic transcription of Czech language oral history in the MALACH project: resources and initial experiments. Lecture Notes in Artificial Intelligence, 2448, p. 253-260, 2002.
Publications with no year given
System for Fast Lexical and Phonetic Spoken Term Detection in a Czech Cultural Heritage Archive. EURASIP Journal on Audio, Speech, and Music Processing, [submitted, in review], Springer-Verlag, GmbH, Heidelberg, Germany.