dataset

class AnnotatedDataset(groups, *, target, observations, categories, background_category)

Bases: Dataset, AnnotatedSampleableMixin

A dataset annotated with observations.

Parameters:
  • groups (Mapping[str | int, Group]) – The groups of individuals in the dataset.

  • target (Literal['individual', 'dyad']) – The target of the dataset.

  • observations (DataFrame) – The observations of the dataset.

  • categories (tuple[str, ...]) – The categories of the dataset.

  • background_category (str) – The background category of the dataset.

class Dataset(groups, *, target)

Bases: NestedSampleableMixin, SampleableMixin

A dataset is a collection of groups (Group), each of which is a collection of individuals (Individual) or dyads (Dyad).

Parameters:
  • groups (Mapping[str | int, Group]) – A mapping of group identifiers to groups.

  • target (Literal['individual', 'dyad']) – The target type of the dataset.

classmethod REQUIRED_COLUMNS(target=None)

Returns the required columns for annotations with the given target.

Parameters:

target (Literal['individual', 'dyad'] | None, default: None) – The target type for the annotations.

Return type:

tuple[Literal['group'], Literal['actor'], Literal['category'], Literal['start'], Literal['stop']] | tuple[Literal['group'], Literal['actor'], Literal['recipient'], Literal['category'], Literal['start'], Literal['stop']]

Returns:

The required columns for annotations.
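As a rough illustration of how the two column sets might be used, the sketch below checks a table's column names before annotation. It does not call the library: required_columns and missing_columns are hypothetical stand-ins. The mapping of 'individual' to the five-column tuple and 'dyad' to the six-column tuple (with 'recipient') is an assumption based on the return type above, and the behaviour for target=None is not covered.

```python
# Hypothetical stand-ins; not part of the documented API.
INDIVIDUAL_COLUMNS = ("group", "actor", "category", "start", "stop")
DYAD_COLUMNS = ("group", "actor", "recipient", "category", "start", "stop")


def required_columns(target):
    """Return the assumed required columns for the given target."""
    if target == "individual":
        return INDIVIDUAL_COLUMNS
    if target == "dyad":
        return DYAD_COLUMNS
    raise ValueError(f"unknown target: {target!r}")


def missing_columns(columns, target):
    """List required columns that are absent from `columns`, in order."""
    present = set(columns)
    return [name for name in required_columns(target) if name not in present]


observed = ["group", "actor", "category", "start", "stop"]
print(missing_columns(observed, "individual"))  # []
print(missing_columns(observed, "dyad"))        # ['recipient']
```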

annotate(observations, *, categories, background_category)

Annotates the dataset with the given observations.

Parameters:
  • observations (DataFrame) – The observations.

  • categories (tuple[str, ...]) – Categories of the observations.

  • background_category (str) – The background category of the observations.

Return type:

AnnotatedDataset

Returns:

The annotated dataset.

exclude_individuals(individuals, *, subset_actors_only=True)

Exclude individuals from the dataset.

Parameters:
  • individuals (Sequence[str | int | tuple[str | int, str | int]]) – The individuals to exclude.

  • subset_actors_only (bool, default: True) – Only relevant if target="dyad". If True, only dyads in which an excluded individual is the actor are dropped. If False, all dyads that involve an excluded individual (as either actor or recipient) are dropped.

Return type:

Self

Returns:

The dataset with the excluded individuals.
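The effect of subset_actors_only on dyads can be sketched with plain tuples. This is a minimal illustration under stated assumptions, not the library's implementation: keep_dyads is a hypothetical helper, and dyads are represented as (actor, recipient) pairs.

```python
def keep_dyads(dyads, excluded, subset_actors_only=True):
    """Drop dyads that involve excluded individuals.

    Hypothetical helper: each dyad is an (actor, recipient) pair.
    """
    excluded = set(excluded)
    kept = []
    for actor, recipient in dyads:
        if actor in excluded:
            continue  # dyads with an excluded actor are always dropped
        if not subset_actors_only and recipient in excluded:
            continue  # additionally drop dyads with an excluded recipient
        kept.append((actor, recipient))
    return kept


dyads = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "b")]
actors_only = keep_dyads(dyads, {"a"})
either_role = keep_dyads(dyads, {"a"}, subset_actors_only=False)
```

With "a" excluded, actors_only still contains ("b", "a"), because "a" appears there only as recipient, whereas either_role drops that dyad as well.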

classmethod from_groups(groups)

Create a new dataset from groups.

Parameters:

groups (Mapping[str | int, Group]) – The groups to include in the dataset.

Return type:

Self

Returns:

The dataset.

property individuals: tuple[tuple[str | int, str | int], ...]

Returns a tuple of all subjects (individuals in groups) in the dataset.
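Judging by the type annotation, each subject appears to be a pair of two identifiers. The sketch below assumes the pair is (group identifier, individual identifier) and builds such a tuple from hypothetical data; the groups mapping here is a plain dict of lists, not the library's Group objects.

```python
# Hypothetical data: group identifier -> identifiers of its individuals.
groups = {"g1": ["a", "b"], 2: ["a", "c"]}

# Flatten into (group identifier, individual identifier) pairs, so that
# individuals with the same identifier in different groups stay distinct.
subjects = tuple(
    (group_id, individual_id)
    for group_id, members in groups.items()
    for individual_id in members
)
```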

k_fold(k, *, random_state, subset_actors_only=True)

Generates k-fold splits of the dataset.

Parameters:
  • k (int) – The number of folds.

  • random_state (int | None | Generator) – The random state to use for splitting.

  • subset_actors_only (bool, default: True) – Whether to only include actors in the split.

Yields:

One split, a tuple of two datasets, for each of the k folds.

Return type:

Generator[tuple[Self, Self], None, None]

See also

exclude_individuals() for more details on the subset_actors_only parameter.
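For readers unfamiliar with k-fold splitting, here is a minimal sketch of the general technique applied to a list of subjects. It is not the library's implementation: k_fold_subjects is a hypothetical function, it uses the standard library's random module instead of the documented random_state argument, and the ordering of each yielded pair (remaining subjects first, held-out fold second) is an assumption.

```python
import random


def k_fold_subjects(subjects, k, seed=None):
    """Yield (remaining, held_out) subject lists for each of k folds."""
    if not 2 <= k <= len(subjects):
        raise ValueError("k must be between 2 and the number of subjects")
    shuffled = list(subjects)
    random.Random(seed).shuffle(shuffled)
    # Distribute subjects round-robin so fold sizes differ by at most one.
    folds = [shuffled[i::k] for i in range(k)]
    for i, held_out in enumerate(folds):
        remaining = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield remaining, held_out


subjects = [("g1", "a"), ("g1", "b"), ("g2", "c"), ("g2", "d"), ("g2", "e")]
splits = list(k_fold_subjects(subjects, k=3, seed=0))
```

Every subject is held out exactly once across the k splits, and the two halves of each split are disjoint.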

split(size, *, random_state, subset_actors_only=True)

Split the dataset into two subsets.

Parameters:
  • size (int | float) – The size of the first subset. If a float, it must lie in the open interval (0.0, 1.0).

  • random_state (int | None | Generator) – The random state for reproducibility.

  • subset_actors_only (bool, default: True) – Whether to only include actors in the split.

Return type:

tuple[Self, Self]

Returns:

A tuple of two subsets.

See also

exclude_individuals() for more details on the subset_actors_only parameter.
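One plausible reading of the size parameter is sketched below. The reference does not spell out the semantics, so everything here is an assumption: a float is treated as a fraction of the total (rounded down), an int as an absolute count, and resolve_first_subset_size is a hypothetical helper rather than a library function.

```python
def resolve_first_subset_size(size, n):
    """Translate `size` into a number of items out of `n`.

    Assumptions (not confirmed by the reference): a float is a fraction
    of `n`, rounded down; an int is an absolute count.
    """
    if isinstance(size, float):
        if not 0.0 < size < 1.0:
            raise ValueError("float size must lie in the open interval (0.0, 1.0)")
        return int(size * n)
    if not 0 < size < n:
        raise ValueError("int size must leave both subsets non-empty")
    return size


from_fraction = resolve_first_subset_size(0.25, 8)  # a quarter of 8 items
from_count = resolve_first_subset_size(3, 8)        # exactly 3 items
```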

include(individual, exclude)

Check if an individual (i.e., subject, individual of a group in a dataset) should be included.

Parameters:
  • individual – The individual to check.

  • exclude – The individuals to exclude.

Return type:

bool

Returns:

If the individual should be included.