API Reference

Command Line Program

meds_reader_convert [source_meds_path] [destination_path] --num_threads [num_threads]

Convert a MEDS dataset to a meds_reader SubjectDatabase.

See https://github.com/Medical-Event-Data-Standard/meds for the details of the expected input format.

Parameters:
  • source_meds_path (str) – The path to the source MEDS dataset.

  • destination_path (str) – The path of where to write the resulting meds_reader SubjectDatabase.

  • num_threads (int) – The number of threads to use.
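
As an illustration only, the conversion step can be driven from Python with the standard subprocess module. This is a minimal sketch: the helper names build_convert_command and convert are hypothetical, and it assumes that --num_threads takes its value as the following argument.

```python
import subprocess
from typing import List


def build_convert_command(
    source_meds_path: str, destination_path: str, num_threads: int = 1
) -> List[str]:
    """Build the argument list for the meds_reader_convert program."""
    return [
        "meds_reader_convert",
        source_meds_path,
        destination_path,
        # Assumption: the flag is followed by its integer value.
        "--num_threads",
        str(num_threads),
    ]


def convert(source_meds_path: str, destination_path: str, num_threads: int = 1) -> None:
    """Run the conversion and raise CalledProcessError if it fails."""
    command = build_convert_command(source_meds_path, destination_path, num_threads)
    subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems when either path contains spaces.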


Python Module

class meds_reader.SubjectDatabase(path_to_database: str, num_threads: int = 1)

Open a SubjectDatabase. The path must point to a database created by meds_reader_convert.

path_to_database: str

The path to the database object

properties: Mapping[str, pyarrow.lib.DataType]

The per-event properties for this dataset

__len__() → int

The number of subjects in the database

__getitem__(subject_id: int) → meds_reader.Subject

Retrieve a single subject from the database

__iter__() → Iterator[int]

Get all subject ids in the database

filter(subject_ids: List[int]) → meds_reader.SubjectDatabase

Filter the database to a list of subjects
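
The accessors above can be combined as in the following sketch. The helper names are hypothetical, and the functions only rely on the documented __iter__, __getitem__ and Subject.events members, so they accept any object that behaves like a SubjectDatabase.

```python
import itertools
from typing import Any, Dict, List


def first_subject_ids(database: Any, n: int) -> List[int]:
    """Return up to n subject ids, in the order the database yields them."""
    return list(itertools.islice(iter(database), n))


def events_per_subject(database: Any, max_subjects: int = 5) -> Dict[int, int]:
    """Count the events of the first few subjects in the database."""
    counts: Dict[int, int] = {}
    for subject_id in first_subject_ids(database, max_subjects):
        subject = database[subject_id]  # __getitem__ returns a Subject
        counts[subject_id] = len(subject.events)
    return counts


# Hypothetical usage; the path is a placeholder:
#   database = meds_reader.SubjectDatabase("path/to/database", num_threads=4)
#   subset = database.filter(first_subject_ids(database, 100))
#   print(len(subset), events_per_subject(subset))
```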

map(map_func: Callable[[Iterator[meds_reader.Subject]], meds_reader.A]) → Iterator[meds_reader.A]

Apply a function to every subject in the database, in a multi-threaded manner.

map_func is a callable that takes an iterable of subjects.
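
A minimal sketch of a map_func, with hypothetical helper names. How subjects are divided among calls to map_func is not specified here, so the sketch combines the partial results with a sum, which does not depend on that split.

```python
from typing import Any, Iterator


def count_events(subjects: Iterator[Any]) -> int:
    """map_func: consume an iterator of subjects and return one result."""
    return sum(len(subject.events) for subject in subjects)


def total_events(database: Any) -> int:
    """Add up the partial counts produced by database.map."""
    return sum(database.map(count_events))
```

Because map_func receives many subjects per call, per-call setup costs are paid once per batch rather than once per subject.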

map_with_data(map_func: Callable[[Iterator[Tuple[meds_reader.Subject, Sequence[Any]]]], meds_reader.A], data: pandas.core.frame.DataFrame, assume_sorted: bool = False) → Iterator[meds_reader.A]

Apply a function with associated data to every subject in the database, in a multi-threaded manner.

map_func is a callable that takes an iterable of subjects paired with rows from the provided table for that subject_id.

The provided table must have ‘subject_id’ as an integer index that will be used for mapping rows.

Note

The input must be sorted by subject_id. By default the data is sorted automatically; pass assume_sorted=True to skip that step when the data is already sorted.
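
The following sketch shows one possible map_func for map_with_data. The helper names are hypothetical, and the layout of each row is not described here, so the code treats the rows as an opaque sequence and only counts them.

```python
from typing import Any, Dict, Iterable, Iterator, Sequence, Tuple


def rows_per_subject(
    pairs: Iterator[Tuple[Any, Sequence[Any]]]
) -> Dict[int, Tuple[int, int]]:
    """map_func: map subject_id to (number of events, number of rows)."""
    result: Dict[int, Tuple[int, int]] = {}
    for subject, rows in pairs:
        result[subject.subject_id] = (len(subject.events), len(rows))
    return result


def merge_results(partial_results: Iterable[Dict[int, Any]]) -> Dict[int, Any]:
    """Merge the dictionaries returned by the individual map_func calls."""
    merged: Dict[int, Any] = {}
    for partial in partial_results:
        merged.update(partial)
    return merged


# Hypothetical usage, assuming `database` is an open SubjectDatabase and
# `labels` is a pandas DataFrame with an integer 'subject_id' index:
#   merged = merge_results(database.map_with_data(rows_per_subject, labels))
```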

class meds_reader.Subject

A subject consists of a subject_id and a sequence of Events

subject_id: int

The unique identifier for this subject

events: Sequence[meds_reader.Event]

The events that have happened to the subject

class meds_reader.Event

An event represents a single unit of information about a subject. It contains a time and code, and potentially more properties.

time: datetime.datetime

The time the event occurred

code: str

An identifier for the type of event that occurred

__getattr__(name: str) → Any

Events can contain arbitrary additional properties. This retrieves the specified property, or returns None if the event has no value for it.

__iter__() → Iterator[Tuple[str, Any]]

Iterate through the non-None properties of this event.
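
As a sketch of how the two access styles fit together, the hypothetical helpers below collect an event's properties through iteration and read a single optional property through attribute access. Property names such as numeric_value are assumptions about the dataset, not something this class guarantees.

```python
from typing import Any, Dict


def event_properties(event: Any) -> Dict[str, Any]:
    """Collect the (name, value) pairs yielded by iterating over an event."""
    # A comprehension is used instead of dict(event), because dict() first
    # probes for a `keys` attribute and an event answers None for unknown
    # attribute names.
    return {name: value for name, value in event}


def get_property(event: Any, name: str, default: Any = None) -> Any:
    """Read an optional property, substituting a default when it is None."""
    value = getattr(event, name)
    return default if value is None else value


# Hypothetical usage with an event taken from subject.events:
#   value = get_property(event, "numeric_value", default=float("nan"))
```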