Speech enhancement exploiting the source-filter model

Bibliographic details

Title
Speech enhancement exploiting the source-filter model
Responsibility
Elshamy, Samy (author); Fingscheidt, Tim (academic supervisor); Martin, Rainer (academic supervisor); Jorswieck, Eduard Axel (academic supervisor); Technische Universität Braunschweig (degree-granting institution)
Thesis note
Dissertation, Technische Universität Carolo-Wilhelmina zu Braunschweig, 2020, cumulative dissertation
Published
Braunschweig, 2020
Year of publication
2020
Also published as
Elshamy, Samy, Speech enhancement exploiting the source-filter model, online edition, Braunschweig, 2020, 1 online resource
Media type
Book, thesis
Data source
K10plus Verbundkatalog

LEADER 05412cam a2200541 4500
001 183-1750539209
003 DE-627
005 20221124105807.0
007 tu
008 210305s2020 gw ||||| m 00| ||eng c
035 |a (DE-627)1750539209 
035 |a (DE-599)KXP1750539209 
035 |a (OCoLC)1240489942 
040 |a DE-627  |b ger  |c DE-627  |e rda 
041 |a eng 
044 |c XA-DE 
084 |a 54.75  |2 bkl 
084 |a 53.73  |2 bkl 
100 1 |a Elshamy, Samy  |e VerfasserIn  |4 aut 
245 1 0 |a Speech enhancement exploiting the source-filter model  |c von Samy Elshamy 
264 1 |a Braunschweig  |c 2020 
300 |a 1 Band (verschiedene Seitenzählungen)  |b Illustrationen, Diagramme 
336 |a Text  |b txt  |2 rdacontent 
337 |a ohne Hilfsmittel zu benutzen  |b n  |2 rdamedia 
338 |a Band  |b nc  |2 rdacarrier 
500 |a Es sind 7 Zeitschriftenartikel enthalten 
502 |b Dissertation  |c Technische Universität Carolo-Wilhelmina zu Braunschweig  |d 2020  |g Kumulative Dissertation 
520 |a Imagining everyday life without mobile telephony is nowadays hardly possible. Calls are made in every conceivable situation and environment. Hence, the microphone picks up not only the user’s speech but also sound from the surroundings, which is likely to impede the understanding of the conversational partner. Modern speech enhancement systems are able to mitigate such effects, and most users are not even aware of their existence. In this thesis, the development of a modern single-channel speech enhancement approach is presented, which uses the divide-and-conquer principle to combat environmental noise in microphone signals. Though initially motivated by mobile telephony applications, this approach can be applied whenever speech is to be retrieved from a corrupted signal. The approach uses the so-called source-filter model to divide the problem into two subproblems, which are then conquered by enhancing the source (the excitation signal) and the filter (the spectral envelope) separately. Both enhanced signals are then used to denoise the corrupted signal. The estimation of spectral envelopes has a considerable history, and some approaches already exist for speech enhancement. However, they typically neglect the excitation signal, which makes it impossible to enhance the fine structure properly. Both individual enhancement approaches exploit benefits of the cepstral domain, which offers, e.g., advantageous mathematical properties and straightforward synthesis of excitation-like signals. We investigate traditional model-based schemes such as Gaussian mixture models (GMMs), classical signal processing-based approaches, and modern deep neural network (DNN)-based approaches in this thesis. The enhanced signals are not used directly to enhance the corrupted signal (e.g., to synthesize a clean speech signal) but as a so-called a priori signal-to-noise ratio (SNR) estimate in a traditional statistical speech enhancement system. 
Such a traditional system consists of a noise power estimator, an a priori SNR estimator, and a spectral weighting rule that is usually driven by the results of the aforementioned estimators and subsequently employed to retrieve the clean speech estimate from the noisy observation. As a result, the new approach obtains significantly higher noise attenuation compared to current state-of-the-art systems while maintaining a quite comparable speech component quality and speech intelligibility. In consequence, the overall quality of the enhanced speech signal turns out to be superior to that of state-of-the-art speech enhancement approaches. 
546 |a Zusammenfassung in deutscher und englischer Sprache 
583 1 |a Archivierung/Langzeitarchivierung gewährleistet  |f DISS  |2 pdager  |5 DE-84 
655 7 |a Hochschulschrift  |0 (DE-588)4113937-9  |0 (DE-627)105825778  |0 (DE-576)209480580  |2 gnd-content 
689 0 0 |D s  |0 (DE-588)4116579-2  |0 (DE-627)105805696  |0 (DE-576)20950272X  |a Sprachverarbeitung  |2 gnd 
689 0 1 |D s  |0 (DE-588)4177102-3  |0 (DE-627)105351121  |0 (DE-576)209975199  |a Rauschunterdrückung  |2 gnd 
689 0 |5 (DE-627) 
700 1 |a Fingscheidt, Tim  |d 1966-  |e AkademischeR BetreuerIn  |0 (DE-588)120777134  |0 (DE-627)704981327  |0 (DE-576)181384639  |4 dgs 
700 1 |a Martin, Rainer  |e AkademischeR BetreuerIn  |0 (DE-588)143794876  |0 (DE-627)704604914  |0 (DE-576)338939474  |4 dgs 
700 1 |a Jorswieck, Eduard Axel  |d 1975-  |e AkademischeR BetreuerIn  |0 (DE-588)129524913  |0 (DE-627)471010529  |0 (DE-576)297705016  |4 dgs 
710 2 |a Technische Universität Braunschweig  |e Grad-verleihende Institution  |0 (DE-588)36227-X  |0 (DE-627)100834183  |0 (DE-576)19034511X  |4 dgg 
751 |a Braunschweig  |0 (DE-588)4008065-1  |0 (DE-627)106370030  |0 (DE-576)208874119  |4 uvp 
776 0 8 |i Erscheint auch als  |n Online-Ausgabe  |a Elshamy, Samy  |t Speech enhancement exploiting the source-filter model  |b Online-Ausgabe  |d Braunschweig, 2020  |h 1 Online-Ressource  |w (DE-627)1748448455 
856 4 2 |u https://www.gbv.de/dms/tib-ub-hannover/1750539209.pdf  |m V:DE-601  |m B:DE-89  |q pdf/application  |3 Inhaltsverzeichnis 
924 0 |a 3882213418  |b DE-84  |9 84  |c GBV  |d c  |g 3513-5186  |h MAG 
924 0 |a 3882213426  |b DE-84  |9 84  |c GBV  |d b  |g 3513-5199  |h MAG 
924 0 |a 3952297178  |b DE-89  |9 89  |c GBV  |d c  |g H 21 B 690 
936 b k |a 54.75  |j Sprachverarbeitung  |x Informatik  |0 (DE-627)10640587X 
936 b k |a 53.73  |j Nachrichtenübertragung  |0 (DE-627)106417657 
951 |a BO 
980 |a 1750539209  |b 183  |c sid-183-col-kxpbbi 
SOLR
_version_ 1797788680365015040
author Elshamy, Samy
author2 Fingscheidt, Tim, Martin, Rainer, Jorswieck, Eduard Axel
author2_role dgs, dgs, dgs
author2_variant t f tf, r m rm, e a j ea eaj
author_corporate Technische Universität Braunschweig
author_corporate_role dgg
author_facet Elshamy, Samy, Fingscheidt, Tim, Martin, Rainer, Jorswieck, Eduard Axel, Technische Universität Braunschweig
author_role aut
author_sort Elshamy, Samy
author_variant s e se
building Library A
collection sid-183-col-kxpbbi
ctrlnum (DE-627)1750539209, (DE-599)KXP1750539209, (OCoLC)1240489942
facet_avail Local
facet_local_del330 Sprachverarbeitung, Rauschunterdrückung
fincclass_txtF_mv science-computerscience, engineering-electrical
footnote Es sind 7 Zeitschriftenartikel enthalten
format Book, Thesis
format_access_txtF_mv Thesis
format_de14 Book, E-Book
format_de15 Book, E-Book
format_del152 Buch
format_detail_txtF_mv text-print-monograph-independent-thesis
format_dezi4 e-Book
format_finc Book, E-Book, Thesis
format_legacy Book
format_legacy_nrw Book, E-Book
format_nrw Book, E-Book
format_strict_txtF_mv Thesis
genre Hochschulschrift (DE-588)4113937-9 (DE-627)105825778 (DE-576)209480580 gnd-content
genre_facet Hochschulschrift
geogr_code not assigned
geogr_code_person not assigned
id 183-1750539209
illustrated Not Illustrated
imprint Braunschweig, 2020
imprint_str_mv Braunschweig, 2020
institution FID-BBI-DE-23
is_hierarchy_id
is_hierarchy_title
language English
last_indexed 2024-04-30T19:21:28.858Z
match_str elshamy2020speechenhancementexploitingthesourcefiltermodel
mega_collection K10plus Verbundkatalog
oclc_num 1240489942
physical 1 Band (verschiedene Seitenzählungen); Illustrationen, Diagramme
publishDate 2020
publishDateSort 2020
publishPlace Braunschweig
publisher
record_format marcfinc
record_id 1750539209
recordtype marcfinc
rvk_facet No subject assigned
source_id 183
title Speech enhancement exploiting the source-filter model
title_auth Speech enhancement exploiting the source-filter model
title_full Speech enhancement exploiting the source-filter model von Samy Elshamy
title_fullStr Speech enhancement exploiting the source-filter model von Samy Elshamy
title_full_unstemmed Speech enhancement exploiting the source-filter model von Samy Elshamy
title_short Speech enhancement exploiting the source-filter model
title_sort speech enhancement exploiting the source-filter model
title_unstemmed Speech enhancement exploiting the source-filter model
topic Hochschulschrift, Sprachverarbeitung, Rauschunterdrückung
topic_facet Hochschulschrift, Sprachverarbeitung, Rauschunterdrückung
url https://www.gbv.de/dms/tib-ub-hannover/1750539209.pdf
work_keys_str_mv AT elshamysamy speechenhancementexploitingthesourcefiltermodel, AT fingscheidttim speechenhancementexploitingthesourcefiltermodel, AT martinrainer speechenhancementexploitingthesourcefiltermodel, AT jorswieckeduardaxel speechenhancementexploitingthesourcefiltermodel, AT technischeuniversitatbraunschweig speechenhancementexploitingthesourcefiltermodel