LEADER |
07184cam a2200769 4500 |
001 |
183-1748448455 |
003 |
DE-627 |
005 |
20230110225419.0 |
007 |
cr uuu---uuuuu |
008 |
210216s2020 gw |||||om 00| ||eng c |
016 |
7 |
|
|a 1231992123
|2 DE-101
|
024 |
7 |
|
|a urn:nbn:de:gbv:084-2021021215141
|2 urn
|
024 |
7 |
|
|a 10.24355/dbbs.084-202102121510-0
|2 doi
|
035 |
|
|
|a (DE-627)1748448455
|
035 |
|
|
|a (DE-599)KXP1748448455
|
035 |
|
|
|a (OCoLC)1237644181
|
040 |
|
|
|a DE-627
|b ger
|c DE-627
|e rda
|
041 |
|
|
|a eng
|
044 |
|
|
|c XA-DE
|
082 |
0 |
|
|a 621.3828
|q DE-101
|
082 |
0 |
4 |
|a 621.3
|q DE-101
|
084 |
|
|
|a 54.75
|2 bkl
|
084 |
|
|
|a 53.73
|2 bkl
|
100 |
1 |
|
|a Elshamy, Samy
|e VerfasserIn
|4 aut
|
245 |
1 |
0 |
|a Speech enhancement exploiting the source-filter model
|c von Samy Elshamy
|
250 |
|
|
|a Online-Ausgabe
|
264 |
|
1 |
|a Braunschweig
|c 2020
|
264 |
|
2 |
|a Braunschweig
|b Universitätsbibliothek
|c 2021
|
300 |
|
|
|a 1 Online-Ressource
|b Illustrationen, Diagramme
|
336 |
|
|
|a Text
|b txt
|2 rdacontent
|
337 |
|
|
|a Computermedien
|b c
|2 rdamedia
|
338 |
|
|
|a Online-Ressource
|b cr
|2 rdacarrier
|
502 |
|
|
|b Dissertation
|c Technische Universität Carolo-Wilhelmina zu Braunschweig
|d 2020
|g Kumulative Dissertation
|
520 |
|
|
|a Imagining everyday life without mobile telephony is nowadays hardly possible. Calls are being made in every thinkable situation and environment. Hence, the microphone will not only pick up the user’s speech but also sound from the surroundings which is likely to impede the understanding of the conversational partner. Modern speech enhancement systems are able to mitigate such effects and most users are not even aware of their existence. In this thesis the development of a modern single-channel speech enhancement approach is presented, which uses the divide and conquer principle to combat environmental noise in microphone signals. Though initially motivated by mobile telephony applications, this approach can be applied whenever speech is to be retrieved from a corrupted signal. The approach uses the so-called source-filter model to divide the problem into two subproblems which are then subsequently conquered by enhancing the source (the excitation signal) and the filter (the spectral envelope) separately. Both enhanced signals are then used to denoise the corrupted signal. The estimation of spectral envelopes has quite some history and some approaches already exist for speech enhancement. However, they typically neglect the excitation signal which leads to the inability of enhancing the fine structure properly. Both individual enhancement approaches exploit benefits of the cepstral domain which offers, e.g., advantageous mathematical properties and straightforward synthesis of excitation-like signals. We investigate traditional model-based schemes like Gaussian mixture models (GMMs), classical signal processing-based, as well as modern deep neural network (DNN)-based approaches in this thesis. The enhanced signals are not used directly to enhance the corrupted signal (e.g., to synthesize a clean speech signal) but as so-called a priori signal-to-noise ratio (SNR) estimate in a traditional statistical speech enhancement system. 
Such a traditional system consists of a noise power estimator, an a priori SNR estimator, and a spectral weighting rule that is usually driven by the results of the aforementioned estimators and subsequently employed to retrieve the clean speech estimate from the noisy observation. As a result the new approach obtains significantly higher noise attenuation compared to current state-of-the-art systems while maintaining a quite comparable speech component quality and speech intelligibility. In consequence, the overall quality of the enhanced speech signal turns out to be superior as compared to state-of-the-art speech enhancement approaches.
|
655 |
|
7 |
|a Hochschulschrift
|0 (DE-588)4113937-9
|0 (DE-627)105825778
|0 (DE-576)209480580
|2 gnd-content
|
689 |
0 |
0 |
|D s
|0 (DE-588)4116579-2
|0 (DE-627)105805696
|0 (DE-576)20950272X
|a Sprachverarbeitung
|2 gnd
|
689 |
0 |
1 |
|D s
|0 (DE-588)4177102-3
|0 (DE-627)105351121
|0 (DE-576)209975199
|a Rauschunterdrückung
|2 gnd
|
689 |
0 |
|
|5 (DE-627)
|
700 |
1 |
|
|a Fingscheidt, Tim
|d 1966-
|e AkademischeR BetreuerIn
|0 (DE-588)120777134
|0 (DE-627)704981327
|0 (DE-576)181384639
|4 dgs
|
700 |
1 |
|
|a Martin, Rainer
|e AkademischeR BetreuerIn
|0 (DE-588)143794876
|0 (DE-627)704604914
|0 (DE-576)338939474
|4 dgs
|
700 |
1 |
|
|a Jorswieck, Eduard Axel
|d 1975-
|e AkademischeR BetreuerIn
|0 (DE-588)129524913
|0 (DE-627)471010529
|0 (DE-576)297705016
|4 dgs
|
710 |
2 |
|
|a Technische Universität Braunschweig
|e Grad-verleihende Institution
|0 (DE-588)36227-X
|0 (DE-627)100834183
|0 (DE-576)19034511X
|4 dgg
|
751 |
|
|
|a Braunschweig
|0 (DE-588)4008065-1
|0 (DE-627)106370030
|0 (DE-576)208874119
|4 uvp
|
776 |
0 |
8 |
|i Erscheint auch als
|n Druck-Ausgabe
|a Elshamy, Samy
|t Speech enhancement exploiting the source-filter model
|d Braunschweig, 2020
|h 1 Band (verschiedene Seitenzählungen)
|w (DE-627)1750539209
|
856 |
4 |
0 |
|u https://doi.org/10.24355/dbbs.084-202102121510-0
|v 2022-02-07
|x Resolving-System
|z kostenfrei
|
856 |
4 |
0 |
|u https://doi.org/10.24355/dbbs.084-202102121510-0
|v 2021-05-05
|x Resolving-System
|z kostenfrei
|
856 |
4 |
0 |
|u https://nbn-resolving.org/urn:nbn:de:gbv:084-2021021215141
|v 2021-05-05
|x Resolving-System
|z kostenfrei
|
856 |
4 |
0 |
|u https://d-nb.info/1231992123/34
|v 2021-05-05
|x Langzeitarchivierung Nationalbibliothek
|
856 |
4 |
0 |
|u https://publikationsserver.tu-braunschweig.de/receive/dbbs_mods_00069305
|q application/pdf
|v 2021-05-05
|x Verlag
|z kostenfrei
|
912 |
|
|
|a GBV-ODiss
|
924 |
1 |
|
|a 3855517843
|b DE-84
|9 84
|c GBV
|d d
|k https://doi.org/10.24355/dbbs.084-202102121510-0
|
924 |
1 |
|
|a 3932754727
|b DE-46
|9 46
|c GBV
|d d
|k https://nbn-resolving.de/urn:nbn:de:gbv:084-2021021215141
|
924 |
1 |
|
|a 3932827430
|b DE-18
|9 18
|c GBV
|d d
|k https://nbn-resolving.de/urn:nbn:de:gbv:084-2021021215141
|
924 |
1 |
|
|a 3932885902
|b DE-830
|9 830
|c GBV
|d d
|k https://nbn-resolving.de/urn:nbn:de:gbv:084-2021021215141
|
924 |
1 |
|
|a 3932960955
|b DE-7
|9 7
|c GBV
|d d
|k https://nbn-resolving.de/urn:nbn:de:gbv:084-2021021215141
|
924 |
1 |
|
|a 3933020867
|b DE-705
|9 705
|c GBV
|d d
|k https://nbn-resolving.de/urn:nbn:de:gbv:084-2021021215141
|
924 |
1 |
|
|a 3933078253
|b DE-Wim2
|9 Wim 2
|c GBV
|d d
|k https://nbn-resolving.de/urn:nbn:de:gbv:084-2021021215141
|
924 |
1 |
|
|a 4168925641
|b DE-3
|9 3
|c GBV
|d d
|k https://nbn-resolving.org/urn:nbn:de:gbv:084-2021021215141
|
924 |
1 |
|
|a 3933435188
|b DE-841
|9 841
|c GBV
|d d
|k https://nbn-resolving.de/urn:nbn:de:gbv:084-2021021215141
|
924 |
1 |
|
|a 3933149738
|b DE-Luen4
|9 Lün 4
|c GBV
|d d
|k https://nbn-resolving.de/urn:nbn:de:gbv:084-2021021215141
|
924 |
1 |
|
|a 3933191963
|b DE-959
|9 959
|c GBV
|d d
|k https://nbn-resolving.de/urn:nbn:de:gbv:084-2021021215141
|
924 |
1 |
|
|a 393327267X
|b DE-960
|9 960
|c GBV
|d d
|k https://nbn-resolving.de/urn:nbn:de:gbv:084-2021021215141
|
924 |
1 |
|
|a 3933389259
|b DE-960-3
|9 960/3
|c GBV
|d d
|k https://nbn-resolving.de/urn:nbn:de:gbv:084-2021021215141
|
936 |
b |
k |
|a 54.75
|j Sprachverarbeitung
|x Informatik
|q SEPA
|0 (DE-627)10640587X
|
936 |
b |
k |
|a 53.73
|j Nachrichtenübertragung
|q SEPA
|0 (DE-627)106417657
|
951 |
|
|
|a BO
|
980 |
|
|
|a 1748448455
|b 183
|c sid-183-col-kxpbbi
|
SOLR
_version_ |
1799795412737982464 |
author |
Elshamy, Samy |
author2 |
Fingscheidt, Tim, Martin, Rainer, Jorswieck, Eduard Axel |
author2_role |
dgs, dgs, dgs |
author2_variant |
t f tf, r m rm, e a j ea eaj |
author_corporate |
Technische Universität Braunschweig |
author_corporate_role |
dgg |
author_facet |
Elshamy, Samy, Fingscheidt, Tim, Martin, Rainer, Jorswieck, Eduard Axel, Technische Universität Braunschweig |
author_role |
aut |
author_sort |
Elshamy, Samy |
author_variant |
s e se |
building |
Library A |
collection |
GBV-ODiss, sid-183-col-kxpbbi |
contents |
Imagining everyday life without mobile telephony is nowadays hardly possible. Calls are being made in every thinkable situation and environment. Hence, the microphone will not only pick up the user’s speech but also sound from the surroundings which is likely to impede the understanding of the conversational partner. Modern speech enhancement systems are able to mitigate such effects and most users are not even aware of their existence. In this thesis the development of a modern single-channel speech enhancement approach is presented, which uses the divide and conquer principle to combat environmental noise in microphone signals. Though initially motivated by mobile telephony applications, this approach can be applied whenever speech is to be retrieved from a corrupted signal. The approach uses the so-called source-filter model to divide the problem into two subproblems which are then subsequently conquered by enhancing the source (the excitation signal) and the filter (the spectral envelope) separately. Both enhanced signals are then used to denoise the corrupted signal. The estimation of spectral envelopes has quite some history and some approaches already exist for speech enhancement. However, they typically neglect the excitation signal which leads to the inability of enhancing the fine structure properly. Both individual enhancement approaches exploit benefits of the cepstral domain which offers, e.g., advantageous mathematical properties and straightforward synthesis of excitation-like signals. We investigate traditional model-based schemes like Gaussian mixture models (GMMs), classical signal processing-based, as well as modern deep neural network (DNN)-based approaches in this thesis. The enhanced signals are not used directly to enhance the corrupted signal (e.g., to synthesize a clean speech signal) but as so-called a priori signal-to-noise ratio (SNR) estimate in a traditional statistical speech enhancement system. 
Such a traditional system consists of a noise power estimator, an a priori SNR estimator, and a spectral weighting rule that is usually driven by the results of the aforementioned estimators and subsequently employed to retrieve the clean speech estimate from the noisy observation. As a result the new approach obtains significantly higher noise attenuation compared to current state-of-the-art systems while maintaining a quite comparable speech component quality and speech intelligibility. In consequence, the overall quality of the enhanced speech signal turns out to be superior as compared to state-of-the-art speech enhancement approaches. |
ctrlnum |
(DE-627)1748448455, (DE-599)KXP1748448455, (OCoLC)1237644181 |
dewey-full |
621.3828, 621.3 |
dewey-hundreds |
600 - Technology (Applied sciences) |
dewey-ones |
621 - Applied physics |
dewey-raw |
621.3828, 621.3 |
dewey-search |
621.3828, 621.3 |
dewey-sort |
3621.3828 |
dewey-tens |
620 - Engineering and allied operations |
doi_str_mv |
10.24355/dbbs.084-202102121510-0 |
edition |
Online-Ausgabe |
facet_912a |
GBV-ODiss |
facet_avail |
Online, Free |
facet_local_del330 |
Sprachverarbeitung, Rauschunterdrückung |
finc_class_facet |
Technik |
fincclass_txtF_mv |
engineering-electrical, engineering-process, technology, science-computerscience |
format |
eBook, Thesis |
format_access_txtF_mv |
Thesis |
format_de105 |
Ebook |
format_de14 |
Book, E-Book |
format_de15 |
Book, E-Book |
format_del152 |
Buch |
format_detail_txtF_mv |
text-online-monograph-independent-thesis |
format_dezi4 |
e-Book |
format_finc |
Book, E-Book, Thesis |
format_legacy |
ElectronicBook |
format_legacy_nrw |
Book, E-Book |
format_nrw |
Book, E-Book |
format_strict_txtF_mv |
E-Thesis |
genre |
Hochschulschrift (DE-588)4113937-9 (DE-627)105825778 (DE-576)209480580 gnd-content |
genre_facet |
Hochschulschrift |
geogr_code |
not assigned |
geogr_code_person |
not assigned |
id |
183-1748448455 |
illustrated |
Not Illustrated |
imprint |
Braunschweig, 2020 |
imprint_str_mv |
Braunschweig, 2020 |
institution |
FID-BBI-DE-23 |
is_hierarchy_id |
|
is_hierarchy_title |
|
language |
English |
last_indexed |
2024-05-22T22:57:39.131Z |
marc024a_ct_mv |
urn:nbn:de:gbv:084-2021021215141, 10.24355/dbbs.084-202102121510-0 |
match_str |
elshamy2020speechenhancementexploitingthesourcefiltermodel |
mega_collection |
K10plus Verbundkatalog |
oclc_num |
1237644181 |
physical |
1 Online-Ressource; Illustrationen, Diagramme |
publishDate |
2020, 2021 |
publishDateSort |
2020 |
publishPlace |
Braunschweig; Braunschweig |
publisher |
Universitätsbibliothek |
record_format |
marcfinc |
record_id |
1748448455 |
recordtype |
marcfinc |
rvk_facet |
No subject assigned |
source_id |
183 |
spelling |
Elshamy, Samy VerfasserIn aut, Speech enhancement exploiting the source-filter model von Samy Elshamy, Online-Ausgabe, Braunschweig 2020, Braunschweig Universitätsbibliothek 2021, 1 Online-Ressource Illustrationen, Diagramme, Text txt rdacontent, Computermedien c rdamedia, Online-Ressource cr rdacarrier, Dissertation Technische Universität Carolo-Wilhelmina zu Braunschweig 2020 Kumulative Dissertation, Imagining everyday life without mobile telephony is nowadays hardly possible. Calls are being made in every thinkable situation and environment. Hence, the microphone will not only pick up the user’s speech but also sound from the surroundings which is likely to impede the understanding of the conversational partner. Modern speech enhancement systems are able to mitigate such effects and most users are not even aware of their existence. In this thesis the development of a modern single-channel speech enhancement approach is presented, which uses the divide and conquer principle to combat environmental noise in microphone signals. Though initially motivated by mobile telephony applications, this approach can be applied whenever speech is to be retrieved from a corrupted signal. The approach uses the so-called source-filter model to divide the problem into two subproblems which are then subsequently conquered by enhancing the source (the excitation signal) and the filter (the spectral envelope) separately. Both enhanced signals are then used to denoise the corrupted signal. The estimation of spectral envelopes has quite some history and some approaches already exist for speech enhancement. However, they typically neglect the excitation signal which leads to the inability of enhancing the fine structure properly. Both individual enhancement approaches exploit benefits of the cepstral domain which offers, e.g., advantageous mathematical properties and straightforward synthesis of excitation-like signals. 
We investigate traditional model-based schemes like Gaussian mixture models (GMMs), classical signal processing-based, as well as modern deep neural network (DNN)-based approaches in this thesis. The enhanced signals are not used directly to enhance the corrupted signal (e.g., to synthesize a clean speech signal) but as so-called a priori signal-to-noise ratio (SNR) estimate in a traditional statistical speech enhancement system. Such a traditional system consists of a noise power estimator, an a priori SNR estimator, and a spectral weighting rule that is usually driven by the results of the aforementioned estimators and subsequently employed to retrieve the clean speech estimate from the noisy observation. As a result the new approach obtains significantly higher noise attenuation compared to current state-of-the-art systems while maintaining a quite comparable speech component quality and speech intelligibility. In consequence, the overall quality of the enhanced speech signal turns out to be superior as compared to state-of-the-art speech enhancement approaches., Hochschulschrift (DE-588)4113937-9 (DE-627)105825778 (DE-576)209480580 gnd-content, s (DE-588)4116579-2 (DE-627)105805696 (DE-576)20950272X Sprachverarbeitung gnd, s (DE-588)4177102-3 (DE-627)105351121 (DE-576)209975199 Rauschunterdrückung gnd, (DE-627), Fingscheidt, Tim 1966- AkademischeR BetreuerIn (DE-588)120777134 (DE-627)704981327 (DE-576)181384639 dgs, Martin, Rainer AkademischeR BetreuerIn (DE-588)143794876 (DE-627)704604914 (DE-576)338939474 dgs, Jorswieck, Eduard Axel 1975- AkademischeR BetreuerIn (DE-588)129524913 (DE-627)471010529 (DE-576)297705016 dgs, Technische Universität Braunschweig Grad-verleihende Institution (DE-588)36227-X (DE-627)100834183 (DE-576)19034511X dgg, Braunschweig (DE-588)4008065-1 (DE-627)106370030 (DE-576)208874119 uvp, Erscheint auch als Druck-Ausgabe Elshamy, Samy Speech enhancement exploiting the source-filter model Braunschweig, 2020 1 Band (verschiedene 
Seitenzählungen) (DE-627)1750539209, https://doi.org/10.24355/dbbs.084-202102121510-0 2022-02-07 Resolving-System kostenfrei, https://doi.org/10.24355/dbbs.084-202102121510-0 2021-05-05 Resolving-System kostenfrei, https://nbn-resolving.org/urn:nbn:de:gbv:084-2021021215141 2021-05-05 Resolving-System kostenfrei, https://d-nb.info/1231992123/34 2021-05-05 Langzeitarchivierung Nationalbibliothek, https://publikationsserver.tu-braunschweig.de/receive/dbbs_mods_00069305 application/pdf 2021-05-05 Verlag kostenfrei kostenfrei |
spellingShingle |
Elshamy, Samy, Speech enhancement exploiting the source-filter model, Imagining everyday life without mobile telephony is nowadays hardly possible. Calls are being made in every thinkable situation and environment. Hence, the microphone will not only pick up the user’s speech but also sound from the surroundings which is likely to impede the understanding of the conversational partner. Modern speech enhancement systems are able to mitigate such effects and most users are not even aware of their existence. In this thesis the development of a modern single-channel speech enhancement approach is presented, which uses the divide and conquer principle to combat environmental noise in microphone signals. Though initially motivated by mobile telephony applications, this approach can be applied whenever speech is to be retrieved from a corrupted signal. The approach uses the so-called source-filter model to divide the problem into two subproblems which are then subsequently conquered by enhancing the source (the excitation signal) and the filter (the spectral envelope) separately. Both enhanced signals are then used to denoise the corrupted signal. The estimation of spectral envelopes has quite some history and some approaches already exist for speech enhancement. However, they typically neglect the excitation signal which leads to the inability of enhancing the fine structure properly. Both individual enhancement approaches exploit benefits of the cepstral domain which offers, e.g., advantageous mathematical properties and straightforward synthesis of excitation-like signals. We investigate traditional model-based schemes like Gaussian mixture models (GMMs), classical signal processing-based, as well as modern deep neural network (DNN)-based approaches in this thesis. 
The enhanced signals are not used directly to enhance the corrupted signal (e.g., to synthesize a clean speech signal) but as so-called a priori signal-to-noise ratio (SNR) estimate in a traditional statistical speech enhancement system. Such a traditional system consists of a noise power estimator, an a priori SNR estimator, and a spectral weighting rule that is usually driven by the results of the aforementioned estimators and subsequently employed to retrieve the clean speech estimate from the noisy observation. As a result the new approach obtains significantly higher noise attenuation compared to current state-of-the-art systems while maintaining a quite comparable speech component quality and speech intelligibility. In consequence, the overall quality of the enhanced speech signal turns out to be superior as compared to state-of-the-art speech enhancement approaches., Hochschulschrift, Sprachverarbeitung, Rauschunterdrückung |
title |
Speech enhancement exploiting the source-filter model |
title_auth |
Speech enhancement exploiting the source-filter model |
title_full |
Speech enhancement exploiting the source-filter model von Samy Elshamy |
title_fullStr |
Speech enhancement exploiting the source-filter model von Samy Elshamy |
title_full_unstemmed |
Speech enhancement exploiting the source-filter model von Samy Elshamy |
title_short |
Speech enhancement exploiting the source-filter model |
title_sort |
speech enhancement exploiting the source-filter model |
title_unstemmed |
Speech enhancement exploiting the source-filter model |
topic |
Hochschulschrift, Sprachverarbeitung, Rauschunterdrückung |
topic_facet |
Hochschulschrift, Sprachverarbeitung, Rauschunterdrückung |
url |
https://doi.org/10.24355/dbbs.084-202102121510-0, https://nbn-resolving.org/urn:nbn:de:gbv:084-2021021215141, https://d-nb.info/1231992123/34, https://publikationsserver.tu-braunschweig.de/receive/dbbs_mods_00069305 |
urn |
urn:nbn:de:gbv:084-2021021215141 |
work_keys_str_mv |
AT elshamysamy speechenhancementexploitingthesourcefiltermodel, AT fingscheidttim speechenhancementexploitingthesourcefiltermodel, AT martinrainer speechenhancementexploitingthesourcefiltermodel, AT jorswieckeduardaxel speechenhancementexploitingthesourcefiltermodel, AT technischeuniversitatbraunschweig speechenhancementexploitingthesourcefiltermodel |