Dataset size and splitting.