Speech enhancement exploiting the source-filter model

Title: Speech enhancement exploiting the source-filter model
Responsibility: Elshamy, Samy (author); Fingscheidt, Tim (academic supervisor); Martin, Rainer (academic supervisor); Jorswieck, Eduard Axel (academic supervisor); Technische Universität Braunschweig (degree-granting institution)
Thesis note: Dissertation, Technische Universität Carolo-Wilhelmina zu Braunschweig, 2020, cumulative dissertation
Edition: Online edition
Published: Braunschweig, 2020
Braunschweig: Universitätsbibliothek, 2021
Year of publication: 2020
Also published as: Elshamy, Samy, Speech enhancement exploiting the source-filter model, Braunschweig, 2020, 1 volume (various pagings)
Media type: E-book, thesis
Data source: K10plus Verbundkatalog

Access

Freely accessible

This resource is freely available.



LEADER	07184cam a2200769 4500
001	183-1748448455
003	DE-627
005	20230110225419.0
007	cr uuu---uuuuu
008	210216s2020 gw \|\|\|\|\|om 00\| \|\|eng c
016	7		\|a 1231992123 \|2 DE-101
024	7		\|a urn:nbn:de:gbv:084-2021021215141 \|2 urn
024	7		\|a 10.24355/dbbs.084-202102121510-0 \|2 doi
035			\|a (DE-627)1748448455
035			\|a (DE-599)KXP1748448455
035			\|a (OCoLC)1237644181
040			\|a DE-627 \|b ger \|c DE-627 \|e rda
041			\|a eng
044			\|c XA-DE
082	0		\|a 621.3828 \|q DE-101
082	0	4	\|a 621.3 \|q DE-101
084			\|a 54.75 \|2 bkl
084			\|a 53.73 \|2 bkl
100	1		\|a Elshamy, Samy \|e VerfasserIn \|4 aut
245	1	0	\|a Speech enhancement exploiting the source-filter model \|c von Samy Elshamy
250			\|a Online-Ausgabe
264		1	\|a Braunschweig \|c 2020
264		2	\|a Braunschweig \|b Universitätsbibliothek \|c 2021
300			\|a 1 Online-Ressource \|b Illustrationen, Diagramme
336			\|a Text \|b txt \|2 rdacontent
337			\|a Computermedien \|b c \|2 rdamedia
338			\|a Online-Ressource \|b cr \|2 rdacarrier
502			\|b Dissertation \|c Technische Universität Carolo-Wilhelmina zu Braunschweig \|d 2020 \|g Kumulative Dissertation
520			\|a Imagining everyday life without mobile telephony is nowadays hardly possible. Calls are being made in every conceivable situation and environment. Hence, the microphone will not only pick up the user’s speech but also sound from the surroundings, which is likely to impede the understanding of the conversational partner. Modern speech enhancement systems are able to mitigate such effects and most users are not even aware of their existence. In this thesis the development of a modern single-channel speech enhancement approach is presented, which uses the divide and conquer principle to combat environmental noise in microphone signals. Though initially motivated by mobile telephony applications, this approach can be applied whenever speech is to be retrieved from a corrupted signal. The approach uses the so-called source-filter model to divide the problem into two subproblems which are then conquered by enhancing the source (the excitation signal) and the filter (the spectral envelope) separately. Both enhanced signals are then used to denoise the corrupted signal. The estimation of spectral envelopes has quite some history and some approaches already exist for speech enhancement. However, they typically neglect the excitation signal, which leaves them unable to enhance the fine structure properly. Both individual enhancement approaches exploit benefits of the cepstral domain which offers, e.g., advantageous mathematical properties and straightforward synthesis of excitation-like signals. We investigate traditional model-based schemes like Gaussian mixture models (GMMs), classical signal processing-based, as well as modern deep neural network (DNN)-based approaches in this thesis. The enhanced signals are not used directly to enhance the corrupted signal (e.g., to synthesize a clean speech signal) but as a so-called a priori signal-to-noise ratio (SNR) estimate in a traditional statistical speech enhancement system. Such a traditional system consists of a noise power estimator, an a priori SNR estimator, and a spectral weighting rule that is usually driven by the results of the aforementioned estimators and subsequently employed to retrieve the clean speech estimate from the noisy observation. As a result, the new approach obtains significantly higher noise attenuation compared to current state-of-the-art systems while maintaining a quite comparable speech component quality and speech intelligibility. In consequence, the overall quality of the enhanced speech signal turns out to be superior as compared to state-of-the-art speech enhancement approaches.
655		7	\|a Hochschulschrift \|0 (DE-588)4113937-9 \|0 (DE-627)105825778 \|0 (DE-576)209480580 \|2 gnd-content
689	0	0	\|D s \|0 (DE-588)4116579-2 \|0 (DE-627)105805696 \|0 (DE-576)20950272X \|a Sprachverarbeitung \|2 gnd
689	0	1	\|D s \|0 (DE-588)4177102-3 \|0 (DE-627)105351121 \|0 (DE-576)209975199 \|a Rauschunterdrückung \|2 gnd
689	0		\|5 (DE-627)
700	1		\|a Fingscheidt, Tim \|d 1966- \|e AkademischeR BetreuerIn \|0 (DE-588)120777134 \|0 (DE-627)704981327 \|0 (DE-576)181384639 \|4 dgs
700	1		\|a Martin, Rainer \|e AkademischeR BetreuerIn \|0 (DE-588)143794876 \|0 (DE-627)704604914 \|0 (DE-576)338939474 \|4 dgs
700	1		\|a Jorswieck, Eduard Axel \|d 1975- \|e AkademischeR BetreuerIn \|0 (DE-588)129524913 \|0 (DE-627)471010529 \|0 (DE-576)297705016 \|4 dgs
710	2		\|a Technische Universität Braunschweig \|e Grad-verleihende Institution \|0 (DE-588)36227-X \|0 (DE-627)100834183 \|0 (DE-576)19034511X \|4 dgg
751			\|a Braunschweig \|0 (DE-588)4008065-1 \|0 (DE-627)106370030 \|0 (DE-576)208874119 \|4 uvp
776	0	8	\|i Erscheint auch als \|n Druck-Ausgabe \|a Elshamy, Samy \|t Speech enhancement exploiting the source-filter model \|d Braunschweig, 2020 \|h 1 Band (verschiedene Seitenzählungen) \|w (DE-627)1750539209
856	4	0	\|u https://doi.org/10.24355/dbbs.084-202102121510-0 \|v 2022-02-07 \|x Resolving-System \|z kostenfrei
856	4	0	\|u https://doi.org/10.24355/dbbs.084-202102121510-0 \|v 2021-05-05 \|x Resolving-System \|z kostenfrei
856	4	0	\|u https://nbn-resolving.org/urn:nbn:de:gbv:084-2021021215141 \|v 2021-05-05 \|x Resolving-System \|z kostenfrei
856	4	0	\|u https://d-nb.info/1231992123/34 \|v 2021-05-05 \|x Langzeitarchivierung Nationalbibliothek
856	4	0	\|u https://publikationsserver.tu-braunschweig.de/receive/dbbs_mods_00069305 \|q application/pdf \|v 2021-05-05 \|x Verlag \|z kostenfrei \|z kostenfrei
912			\|a GBV-ODiss
924	1		\|a 3855517843 \|b DE-84 \|9 84 \|c GBV \|d d \|k https://doi.org/10.24355/dbbs.084-202102121510-0
924	1		\|a 3932754727 \|b DE-46 \|9 46 \|c GBV \|d d \|k https://nbn-resolving.de/urn:nbn:de:gbv:084-2021021215141
924	1		\|a 3932827430 \|b DE-18 \|9 18 \|c GBV \|d d \|k https://nbn-resolving.de/urn:nbn:de:gbv:084-2021021215141
924	1		\|a 3932885902 \|b DE-830 \|9 830 \|c GBV \|d d \|k https://nbn-resolving.de/urn:nbn:de:gbv:084-2021021215141
924	1		\|a 3932960955 \|b DE-7 \|9 7 \|c GBV \|d d \|k https://nbn-resolving.de/urn:nbn:de:gbv:084-2021021215141
924	1		\|a 3933020867 \|b DE-705 \|9 705 \|c GBV \|d d \|k https://nbn-resolving.de/urn:nbn:de:gbv:084-2021021215141
924	1		\|a 3933078253 \|b DE-Wim2 \|9 Wim 2 \|c GBV \|d d \|k https://nbn-resolving.de/urn:nbn:de:gbv:084-2021021215141
924	1		\|a 4168925641 \|b DE-3 \|9 3 \|c GBV \|d d \|k https://nbn-resolving.org/urn:nbn:de:gbv:084-2021021215141
924	1		\|a 3933435188 \|b DE-841 \|9 841 \|c GBV \|d d \|k https://nbn-resolving.de/urn:nbn:de:gbv:084-2021021215141
924	1		\|a 3933149738 \|b DE-Luen4 \|9 Lün 4 \|c GBV \|d d \|k https://nbn-resolving.de/urn:nbn:de:gbv:084-2021021215141
924	1		\|a 3933191963 \|b DE-959 \|9 959 \|c GBV \|d d \|k https://nbn-resolving.de/urn:nbn:de:gbv:084-2021021215141
924	1		\|a 393327267X \|b DE-960 \|9 960 \|c GBV \|d d \|k https://nbn-resolving.de/urn:nbn:de:gbv:084-2021021215141
924	1		\|a 3933389259 \|b DE-960-3 \|9 960/3 \|c GBV \|d d \|k https://nbn-resolving.de/urn:nbn:de:gbv:084-2021021215141
936	b	k	\|a 54.75 \|j Sprachverarbeitung \|x Informatik \|q SEPA \|0 (DE-627)10640587X
936	b	k	\|a 53.73 \|j Nachrichtenübertragung \|q SEPA \|0 (DE-627)106417657
951			\|a BO
980			\|a 1748448455 \|b 183 \|c sid-183-col-kxpbbi

openURL	url_ver=Z39.88-2004&ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fkatalog.fid-bbi.de%3Agenerator&rft.title=Speech+enhancement+exploiting+the+source-filter+model&rft.date=2020&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&rft.creator=Elshamy%2C+Samy&rft.pub=&rft.format=eBook&rft.language=English

SOLR
_version_	1799795412737982464
author	Elshamy, Samy
author2	Fingscheidt, Tim, Martin, Rainer, Jorswieck, Eduard Axel
author2_role	dgs, dgs, dgs
author2_variant	t f tf, r m rm, e a j ea eaj
author_corporate	Technische Universität Braunschweig
author_corporate_role	dgg
author_facet	Elshamy, Samy, Fingscheidt, Tim, Martin, Rainer, Jorswieck, Eduard Axel, Technische Universität Braunschweig
author_role	aut
author_sort	Elshamy, Samy
author_variant	s e se
building	Library A
collection	GBV-ODiss, sid-183-col-kxpbbi
contents	Imagining everyday life without mobile telephony is nowadays hardly possible. Calls are being made in every conceivable situation and environment. Hence, the microphone will not only pick up the user’s speech but also sound from the surroundings, which is likely to impede the understanding of the conversational partner. Modern speech enhancement systems are able to mitigate such effects and most users are not even aware of their existence. In this thesis the development of a modern single-channel speech enhancement approach is presented, which uses the divide and conquer principle to combat environmental noise in microphone signals. Though initially motivated by mobile telephony applications, this approach can be applied whenever speech is to be retrieved from a corrupted signal. The approach uses the so-called source-filter model to divide the problem into two subproblems which are then conquered by enhancing the source (the excitation signal) and the filter (the spectral envelope) separately. Both enhanced signals are then used to denoise the corrupted signal. The estimation of spectral envelopes has quite some history and some approaches already exist for speech enhancement. However, they typically neglect the excitation signal, which leaves them unable to enhance the fine structure properly. Both individual enhancement approaches exploit benefits of the cepstral domain which offers, e.g., advantageous mathematical properties and straightforward synthesis of excitation-like signals. We investigate traditional model-based schemes like Gaussian mixture models (GMMs), classical signal processing-based, as well as modern deep neural network (DNN)-based approaches in this thesis. The enhanced signals are not used directly to enhance the corrupted signal (e.g., to synthesize a clean speech signal) but as a so-called a priori signal-to-noise ratio (SNR) estimate in a traditional statistical speech enhancement system. Such a traditional system consists of a noise power estimator, an a priori SNR estimator, and a spectral weighting rule that is usually driven by the results of the aforementioned estimators and subsequently employed to retrieve the clean speech estimate from the noisy observation. As a result, the new approach obtains significantly higher noise attenuation compared to current state-of-the-art systems while maintaining a quite comparable speech component quality and speech intelligibility. In consequence, the overall quality of the enhanced speech signal turns out to be superior as compared to state-of-the-art speech enhancement approaches.
ctrlnum	(DE-627)1748448455, (DE-599)KXP1748448455, (OCoLC)1237644181
dewey-full	621.3828, 621.3
dewey-hundreds	600 - Technology (Applied sciences)
dewey-ones	621 - Applied physics
dewey-raw	621.3828, 621.3
dewey-search	621.3828, 621.3
dewey-sort	3621.3828
dewey-tens	620 - Engineering and allied operations
doi_str_mv	10.24355/dbbs.084-202102121510-0
edition	Online-Ausgabe
facet_912a	GBV-ODiss
facet_avail	Online, Free
facet_local_del330	Sprachverarbeitung, Rauschunterdrückung
finc_class_facet	Technik
fincclass_txtF_mv	engineering-electrical, engineering-process, technology, science-computerscience
format	eBook, Thesis
format_access_txtF_mv	Thesis
format_de105	Ebook
format_de14	Book, E-Book
format_de15	Book, E-Book
format_del152	Buch
format_detail_txtF_mv	text-online-monograph-independent-thesis
format_dezi4	e-Book
format_finc	Book, E-Book, Thesis
format_legacy	ElectronicBook
format_legacy_nrw	Book, E-Book
format_nrw	Book, E-Book
format_strict_txtF_mv	E-Thesis
genre	Hochschulschrift (DE-588)4113937-9 (DE-627)105825778 (DE-576)209480580 gnd-content
genre_facet	Hochschulschrift
geogr_code	not assigned
geogr_code_person	not assigned
id	183-1748448455
illustrated	Not Illustrated
imprint	Braunschweig, 2020
imprint_str_mv	Braunschweig, 2020
institution	FID-BBI-DE-23
is_hierarchy_id
is_hierarchy_title
language	English
last_indexed	2024-05-22T22:57:39.131Z
marc024a_ct_mv	urn:nbn:de:gbv:084-2021021215141, 10.24355/dbbs.084-202102121510-0
match_str	elshamy2020speechenhancementexploitingthesourcefiltermodel
mega_collection	K10plus Verbundkatalog
oclc_num	1237644181
physical	1 Online-Ressource; Illustrationen, Diagramme
publishDate	2020, , 2021
publishDateSort	2020
publishPlace	Braunschweig, ; Braunschweig
publisher	, : Universitätsbibliothek
record_format	marcfinc
record_id	1748448455
recordtype	marcfinc
rvk_facet	No subject assigned
source_id	183
title	Speech enhancement exploiting the source-filter model
title_auth	Speech enhancement exploiting the source-filter model
title_full	Speech enhancement exploiting the source-filter model von Samy Elshamy
title_fullStr	Speech enhancement exploiting the source-filter model von Samy Elshamy
title_full_unstemmed	Speech enhancement exploiting the source-filter model von Samy Elshamy
title_short	Speech enhancement exploiting the source-filter model
title_sort	speech enhancement exploiting the source-filter model
title_unstemmed	Speech enhancement exploiting the source-filter model
topic	Hochschulschrift, Sprachverarbeitung, Rauschunterdrückung
topic_facet	Hochschulschrift, Sprachverarbeitung, Rauschunterdrückung
url	https://doi.org/10.24355/dbbs.084-202102121510-0, https://nbn-resolving.org/urn:nbn:de:gbv:084-2021021215141, https://d-nb.info/1231992123/34, https://publikationsserver.tu-braunschweig.de/receive/dbbs_mods_00069305
urn	urn:nbn:de:gbv:084-2021021215141
work_keys_str_mv	AT elshamysamy speechenhancementexploitingthesourcefiltermodel, AT fingscheidttim speechenhancementexploitingthesourcefiltermodel, AT martinrainer speechenhancementexploitingthesourcefiltermodel, AT jorswieckeduardaxel speechenhancementexploitingthesourcefiltermodel, AT technischeuniversitatbraunschweig speechenhancementexploitingthesourcefiltermodel
