BELA API reference

For most people, bela.read_eaf() is the first thing to look at. This function returns a bela.Bela2 object for manipulating a BELA transcript directly:

>>> import bela
>>> b2 = bela.read_eaf("my_bela_filename.eaf")

Now you can use the created b2 object to process BELA data.

>>> for person in b2.persons:
...     print(person.name, person.code)
...     for u in person.utterances:
...         print(u, u.from_ts, u.to_ts, u.duration)
...         if u.translation:
...             print(u.translation)
...         for c in u.chunks:
...             print(f"  - {c} [{c.language}]")

The bela module

bela.read_eaf(eaf_path, **kwargs)

Read an EAF file as a Bela2 object

Parameters

eaf_path (str-like object or a Path object) – Path to the EAF file

Returns

A Bela2 object

Return type

bela.Bela2

bela.from_elan(elan, eaf_path=':memory:', **kwargs)

Create a BELA-con version 2.x object from a speach.elan.ELANDoc object

The lex module

This module provides lexicon analysis functions (e.g. counting tokens, calculating type-token ratio, etc.). New users should start with bela.lex.CorpusLexicalAnalyser.

>>> from bela.lex import CorpusLexicalAnalyser
>>> analyser = CorpusLexicalAnalyser()
>>> source = "my_bela_filename.eaf"  # a label identifying this transcript
>>> for person in b2.persons:
...     for u in person.utterances:
...         analyser.add(u.text, u.language, source=source, speaker=person.code)
>>> analyser.analyse()
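
For intuition, the type-token ratio at the heart of this kind of lexical analysis can be sketched in a few lines. The grouping and sample data below are illustrative only, not the analyser's actual implementation:

```python
from collections import defaultdict

def type_token_ratio(tokens):
    """Number of distinct word forms (types) divided by total tokens."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0

# Group hypothetical utterance tokens by speaker code, then score each speaker.
corpus = defaultdict(list)
corpus["CHI"] += ["ball", "ball", "red", "ball"]
corpus["MOT"] += ["look", "at", "the", "red", "ball"]

ratios = {speaker: type_token_ratio(toks) for speaker, toks in corpus.items()}
print(ratios)  # CHI: 2 types / 4 tokens = 0.5; MOT: 5 types / 5 tokens = 1.0
```

A higher ratio indicates more varied vocabulary relative to the amount of speech.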
class bela.lex.CorpusLexicalAnalyser(filepath=':memory:', lang_lex_map=None, word_only=False, lemmatizer=True, **kwargs)[source]

Analyse a corpus text

analyse(external_tokenizer=True)[source]

Analyse all available profiles (i.e. speakers)

read(**kwargs)[source]

Read the CSV file content specified by self.filepath

to_dict()[source]

Export analysed result as a JSON-ready object

BELA-con version 2.0 API

The official BELA convention. Use this version for all new transcripts.

class bela.Bela2(elan, path=':memory:', allow_empty=False, nlp_tokenizer=False, word_only=True, ellipsis=True, validate_baby_languages=False, ansi_languages=('English', 'Vocal Sounds', 'Malay', 'Red Dot', ':v:airstream', ':v:crying', ':v:vocalizations'), auto_tokenize=True, split_punc=True, remove_punc=True, **kwargs)[source]

BELA-convention version 2

find_turns(threshold=1500)[source]

Find potential turn-takings

Parameters

threshold (float) – Maximum delay between utterances, in milliseconds

Returns

A list of utterance pairs, each a 2-tuple of (from utterance, to utterance) objects
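
The pairing logic can be pictured with plain tuples. This is an illustrative re-implementation, not the class's actual code; it assumes utterances carry a speaker code plus from/to timestamps in milliseconds:

```python
def find_turns_sketch(utterances, threshold=1500):
    """Pair consecutive utterances by different speakers whose gap
    (next start minus previous end) is within `threshold` ms."""
    ordered = sorted(utterances, key=lambda u: u[1])  # sort by from_ts
    pairs = []
    for prev, nxt in zip(ordered, ordered[1:]):
        gap = nxt[1] - prev[2]
        if prev[0] != nxt[0] and 0 <= gap <= threshold:
            pairs.append((prev, nxt))
    return pairs

talk = [
    ("MOT", 0, 1200),     # mother speaks first
    ("CHI", 1900, 2500),  # child replies 700 ms later -> a turn
    ("CHI", 6000, 6400),  # same speaker, not a turn
    ("MOT", 9000, 9300),  # 2600 ms gap, over the threshold
]
print(find_turns_sketch(talk))  # one (MOT, CHI) pair
```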

static from_elan(elan, eaf_path=':memory:', **kwargs)[source]

Create a BELA-con version 2.x object from a speach.elan.ELANDoc object

parse_name(tier)[source]

(Internal) Parse participant name and tier type from a tier object and then update the tier object

This function is internal and should not be used outside of this class.

Parameters

tier (speach.elan.ELANTier) – The tier object to parse

static read_eaf(eaf_path, **kwargs)[source]

Read an EAF file as a Bela2 object

Parameters

eaf_path (str-like object or a Path object) – Path to the EAF file

Returns

A Bela2 object

Return type

bela.Bela2

to_language_mix(to_ts=None, auto_compute=True)[source]

Collapse utterances to generate a language mix timeline
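
One plausible reading of such a collapse is summed speaking time per language, optionally cut off at to_ts. The sketch below is an assumption about the output shape, not the method's actual return value:

```python
from collections import Counter

def language_mix_sketch(utterances, to_ts=None):
    """Sum per-language speaking time (ms) for utterances starting before to_ts."""
    mix = Counter()
    for lang, from_ts, end_ts in utterances:
        if to_ts is None or from_ts < to_ts:
            mix[lang] += end_ts - from_ts
    return dict(mix)

utts = [("English", 0, 1000), ("Malay", 1500, 3500), ("English", 4000, 4500)]
print(language_mix_sketch(utts))              # {'English': 1500, 'Malay': 2000}
print(language_mix_sketch(utts, to_ts=2000))  # first two utterances only
```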

tokenize()[source]

Tokenize all utterances
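
The split_punc and remove_punc options on the constructor suggest tokenisation along the following lines. This sketch is illustrative only and does not reproduce the library's actual tokenizer:

```python
import re

def tokenize_sketch(text, split_punc=True, remove_punc=True):
    """Whitespace-tokenise, optionally splitting off and dropping punctuation."""
    if split_punc:
        # Words (allowing internal apostrophes) or single punctuation marks.
        tokens = re.findall(r"\w+(?:'\w+)?|[^\w\s]", text)
    else:
        tokens = text.split()
    if remove_punc:
        tokens = [t for t in tokens if re.search(r"\w", t)]
    return tokens

print(tokenize_sketch("Look, a red ball!"))  # ['Look', 'a', 'red', 'ball']
```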

property annotation

Get an annotation object by ID

property participant_codes

Immutable list of participant codes

property person_map

Map from participant code (i.e. person code) to Person object

property persons

All Person objects in this BELA object

property roots

Direct access to all underlying ELAN root tiers

BELA-con version 1.0 API

Bela1 has been deprecated since March 2020. It remains available for backward compatibility only. Please do not use it for anything other than BLIP’s PILOT10 corpus.

class bela.Bela1[source]

This class represents BELA convention version 1

static read(filepath, autotag=True)[source]

Read an ELAN CSV file

to_language_mix(to_ts=None, auto_compute=True)[source]

Collapse utterances to generate a language mix timeline