Munich Sentence Database

About This Database

This is an online interface presenting sentence completion norms described in the following preprint:

Sterner, E. F., Stadler, M., & Knolle, F. (2025, October 26). Munich Sentence (MuSe) Database – Completion norms and audio recordings for 619 German sentences. https://osf.io/preprints/psyarxiv/evr24_v1

Prediction is a core feature of language, which is widely studied across research domains. The Munich Sentence (MuSe) database enhances reproducibility by providing sentence completion norms for 619 German sentences, including cloze probabilities and entropy estimates from up to 232 participants. Sentence completions were collected in two online studies in which participants completed sentence beginnings with a single-word response after either hearing (auditory sample, N = 133) or reading (visual sample, N = 98) the sentence beginning. All responses were manually preprocessed to correct typos and spelling mistakes and to label grammatical errors, proper nouns, and singular and plural variants of the same response. In addition to the sentence norms, we provide trial-level data with participant-level demographic information and subclinical autistic and schizotypal trait measures. Together with open access R-Scripts or our webtool, this allows for tailoring the cleaning and norming steps to integrate individual difference measures. For a subset of 479 sentence beginnings, the database also includes professional audio recordings of sentence beginnings which can be flexibly combined with 531 recordings of unique sentence-final words and implemented in auditory language paradigms.
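
As an illustration of the two norming measures mentioned above, the following minimal Python sketch computes cloze probabilities and an entropy estimate for a single sentence beginning. It is not the database's own R code: the function name `completion_norms`, the example responses, and the use of Shannon entropy in bits (log base 2) are assumptions made for this example.

```python
import math
from collections import Counter


def completion_norms(responses):
    """Return (cloze probabilities, entropy in bits) for one sentence beginning.

    Assumption: cloze probability is the share of participants giving a
    response, and entropy is Shannon entropy with log base 2.
    """
    counts = Counter(responses)
    total = sum(counts.values())
    if total == 0:
        # No valid responses: nothing to norm.
        return {}, 0.0
    cloze = {word: n / total for word, n in counts.items()}
    entropy = -sum(p * math.log2(p) for p in cloze.values())
    return cloze, entropy


# Hypothetical single-word completions from four participants.
responses = ["Hund", "Hund", "Katze", "Hund"]
cloze, entropy = completion_norms(responses)
# cloze["Hund"] is 0.75, cloze["Katze"] is 0.25, entropy is about 0.81 bits
```

A sentence beginning on which all participants agree has an entropy of 0; the value grows as responses spread over more alternatives.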

All materials, including preprocessing and analysis scripts, are freely accessible via the Open Science Framework (https://osf.io/ktnze/overview).

If you use any of the materials, please cite:

Sterner, E. F., Stadler, M., & Knolle, F. (2025, October 26). Munich Sentence (MuSe) Database – Completion norms and audio recordings for 619 German sentences. https://osf.io/preprints/psyarxiv/evr24_v1

The interface was designed by Chuyang Wang (chuyang.wang(at)tum.de).

1 Customize Data Cleaning and Norming Parameters

Use Pre-Cleaned and Pre-Normed Data

Customize Data Cleaning and Norming

The parameters below show the default values used for pre-cleaned data. To customize these parameters, toggle "Customize Data Cleaning and Norming" above.

The custom cleaning and norming option computes the norms on the fly, so results can take up to a minute to load, depending on the complexity of your filters and the server load.

Customize Cleaning Parameters

Participant Exclusion Criteria

Maximum Proportion of Missing Responses (%)

Maximum Proportion of Grammatical Errors (%)

Optional Demographic Filters

Native Language

Presentation Mode

Questionnaire Data

Trial-Based Cleaning

Exclude Missing Responses (999)

Exclude Grammatical Errors (777)

Exclude Proper Nouns (555)

Exclude Neologisms (444)
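
The exclusion options listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the webtool's actual implementation: the field names `participant`, `response`, and `code`, the idea that each trial carries its label as a numeric code (or `None` for an ordinary response), the threshold values, and the order of the two steps (participants first, then trials) are all hypothetical. Only the numeric codes themselves are taken from the labels above.

```python
from collections import defaultdict

# Label codes as shown in the interface.
MISSING, GRAMMATICAL_ERROR, PROPER_NOUN, NEOLOGISM = 999, 777, 555, 444


def clean_trials(trials, max_missing_pct, max_error_pct, excluded_codes):
    """Drop participants above either threshold, then drop excluded trials."""
    codes_by_participant = defaultdict(list)
    for trial in trials:
        codes_by_participant[trial["participant"]].append(trial["code"])

    kept_participants = set()
    for pid, codes in codes_by_participant.items():
        missing_pct = 100 * codes.count(MISSING) / len(codes)
        error_pct = 100 * codes.count(GRAMMATICAL_ERROR) / len(codes)
        if missing_pct <= max_missing_pct and error_pct <= max_error_pct:
            kept_participants.add(pid)

    return [
        t for t in trials
        if t["participant"] in kept_participants
        and t["code"] not in excluded_codes
    ]


# Hypothetical trial-level data for two participants.
trials = [
    {"participant": "p1", "response": "Hund", "code": None},
    {"participant": "p1", "response": "", "code": MISSING},
    {"participant": "p1", "response": "Berlin", "code": PROPER_NOUN},
    {"participant": "p1", "response": "Katze", "code": None},
    {"participant": "p2", "response": "", "code": MISSING},
    {"participant": "p2", "response": "", "code": MISSING},
    {"participant": "p2", "response": "Maus", "code": None},
    {"participant": "p2", "response": "Haus", "code": GRAMMATICAL_ERROR},
]

# Illustrative thresholds only; these are not the database defaults.
cleaned = clean_trials(
    trials,
    max_missing_pct=30,
    max_error_pct=30,
    excluded_codes={MISSING, PROPER_NOUN},
)
```

In this example, participant `p2` is removed entirely because half of their trials are missing, while `p1` is kept and loses only the missing and proper-noun trials.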

Customize Norming Parameters

Keep Singular/Plural Separate

Harmonize Singular/Plural
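
The difference between the two norming options can be illustrated with a short Python sketch. It assumes a hypothetical lookup table, `variant_map`, that maps each labelled plural variant to its singular form; how the database actually stores these labels is not specified here.

```python
from collections import Counter


def count_responses(responses, variant_map=None):
    """Count responses, optionally merging variants listed in variant_map."""
    variant_map = variant_map or {}
    return Counter(variant_map.get(r, r) for r in responses)


# Hypothetical responses to one sentence beginning.
responses = ["Blume", "Blumen", "Baum"]

# Keep singular and plural separate: three distinct responses.
separate = count_responses(responses)

# Harmonize: "Blumen" is counted together with "Blume".
harmonized = count_responses(responses, variant_map={"Blumen": "Blume"})
```

Harmonizing raises the count of the merged response, which in turn raises its cloze probability and can lower the entropy of the sentence.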

2 Display and Filter Normed Data

Use the filters below to subset the normed dataset based on specific characteristics of interest. Select any combination of filters to tailor the output to your needs. If no filters are selected, the full dataset will be included in your download. Press Search to display your normed dataset.

Results per Page

3 Results

Include Response Details

Include Mapping to Audio Recording Entries
