Text this: Multi-modal Pre-training for Improved Persona Detection Using Multiple Datasets