-
41
Algoritmo de clasificación de expresiones de odio por tipos en español (Algorithm for classifying hate expressions by type in Spanish)
Published 2024“…
Model Architecture
The model is based on pysentimiento/robertuito-base-uncased with the following modifications:
- A dense classification layer added over the base model
- Takes input IDs and attention masks as inputs
- Produces a multi-class classification over 5 hate categories

Dataset
HATEMEDIA Dataset: a custom hate-speech dataset categorized by type:
- Labels: 5 hate-type categories (0-4)
- Preprocessing:
  - Null values removed from text and labels
  - Reindexing and relabeling (original labels are adjusted by subtracting 1)
  - Category 2 excluded during training
  - Category 5 converted to category 2

Training Process
Configuration:
- Batch size: 128
- Epochs: 5
- Learning rate: 2e-5 with 10% warmup steps
- Early stopping with patience=2
- Class weights: balanced to handle class imbalance

Custom metrics:
- Recall for specific classes (focus on class 2)
- Precision for specific classes (focus on class 3)
- F1-score (weighted)
- AUC-PR
- Recall at precision=0.6 (class 3)
- Precision at recall=0.6 (class 2)

Evaluation Metrics
The model is evaluated using macro recall, precision, and F1-score; one-vs-rest AUC; accuracy; per-class metrics; a confusion matrix; and a full classification report.

Technical Features
Data preprocessing:
- Tokenization: maximum length of 128 tokens (truncation and padding)
- Label encoding: one-hot encoding for multi-class classification
- Data split: 80% training, 10% validation, 10% testing

Optimization:
- Optimizer: Adam with linear warmup scheduling
- Loss function: categorical cross-entropy (from_logits=True)
- Imbalance handling: class weights computed automatically

Requirements
The following Python packages are required: TensorFlow, Transformers, scikit-learn, pandas, datasets, matplotlib, seaborn, and numpy.

Usage
1. Data format:
   - CSV file or pandas DataFrame
   - Required column: text (string type)
   - Label column (integer type, 0-4): optional, used for evaluation
2. Text preprocessing:
   - Automatic tokenization with a maximum length of 128 tokens
   - Long texts are automatically truncated
   - Handling of special characters, URLs, and emojis included
3. Label encoding:
   - The model classifies hate speech into 5 categories (0-4)
   - 0: Political hatred: expressions directed against individuals or groups based on political orientation.…”
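The dataset preprocessing described in this record (dropping nulls, shifting labels down by one, excluding category 2, remapping category 5 to 2) can be sketched in pandas. The column names `text` and `label` and the function name are assumptions for illustration, not confirmed by the record:

```python
import pandas as pd

def preprocess_labels(df: pd.DataFrame) -> pd.DataFrame:
    # Drop rows with null text or labels, then reindex
    df = df.dropna(subset=["text", "label"]).reset_index(drop=True)
    # Relabel: original labels are adjusted by subtracting 1
    df["label"] = df["label"].astype(int) - 1
    # Exclude category 2 during training
    df = df[df["label"] != 2].reset_index(drop=True)
    # Convert category 5 to category 2
    df.loc[df["label"] == 5, "label"] = 2
    return df
```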
-
42
Generalized Tensor Decomposition With Features on Multiple Modes
Published 2021“…An efficient alternating optimization algorithm with provable spectral initialization is further developed. …”
-
43
Presentation_1_Modified GAN Augmentation Algorithms for the MRI-Classification of Myocardial Scar Tissue in Ischemic Cardiomyopathy.PPTX
Published 2021“…Currently, there are no optimized deep-learning algorithms for the automated classification of scarred vs. normal myocardium. …”
-
44
DataSheet_1_Multi-Parametric MRI-Based Radiomics Models for Predicting Molecular Subtype and Androgen Receptor Expression in Breast Cancer.docx
Published 2021“…We applied several feature selection strategies, including the least absolute shrinkage and selection operator (LASSO), recursive feature elimination (RFE), maximum relevance minimum redundancy (mRMR), Boruta, and Pearson correlation analysis, to select the optimal features. We then built 120 diagnostic models using distinct classification algorithms and feature sets divided by MRI sequences and selection strategies to predict molecular subtype and AR expression of breast cancer in the testing dataset of leave-one-out cross-validation (LOOCV). …”
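Of the selection strategies this record names, Pearson correlation analysis is the simplest to sketch: rank features by absolute correlation with the outcome and keep the top k. This is an illustrative sketch under that interpretation, not the study's implementation:

```python
import numpy as np

def pearson_select(X, y, k):
    # Center features and outcome
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Pearson correlation of each column with the outcome
    r = (Xc * yc[:, None]).sum(axis=0) / (
        np.sqrt((Xc**2).sum(axis=0)) * np.sqrt((yc**2).sum())
    )
    # Indices of the k features with the largest |r|
    return np.argsort(-np.abs(r))[:k]
```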
-
45
Table_1_An efficient decision support system for leukemia identification utilizing nature-inspired deep feature optimization.pdf
Published 2024“…To optimize feature selection, a customized binary Grey Wolf Algorithm is utilized, achieving an impressive 80% reduction in feature size while preserving key discriminative information. …”
-
46
DataSheet_1_Exploring deep learning radiomics for classifying osteoporotic vertebral fractures in X-ray images.docx
Published 2024“…Logistic regression emerged as the optimal machine learning algorithm for both DLR models. …”
-
47
Thesis-RAMIS-Figs_Slides
Published 2024“…Importantly, this strategy locates samples adaptively on the transition between facies, which improves the performance of conventional MPS algorithms. In conclusion, this work shows that preferential sampling can contribute to MPS even at very small sampling regimes and, as a corollary, demonstrates that prior models (obtained from a training image) can be used effectively not only to simulate non-sensed variables of the field, but also to decide where to measure next.…”
-
48
PathOlOgics_RBCs Python Scripts.zip
Published 2023“…To assess the consistency, diversity, and complexity of the processed data, the Uniform Manifold Approximation and Projection (UMAP) technique was employed to investigate the structural relationships among the various classes (see PathOlOgics_script_3; UMAP visualizations). …”
-
52
DataSheet_1_Near infrared spectroscopy for cooking time classification of cassava genotypes.docx
Published 2024“…Cooking data were classified into binary and multiclass variables (CT4C and CT6C). …”
-
53
Variable Selection with Multiply-Imputed Datasets: Choosing Between Stacked and Grouped Methods
Published 2022“…Building on existing work, we (i) derive and implement efficient cyclic coordinate descent and majorization-minimization optimization algorithms for continuous and binary outcome data, (ii) incorporate adaptive shrinkage penalties, (iii) compare these methods through simulation, and (iv) develop an R package miselect. …”
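As an illustration of the cyclic coordinate descent this abstract mentions, here is a plain lasso solver for a single complete dataset. This is a hypothetical sketch; the stacked and grouped multiply-imputed variants developed in the paper and in miselect are more involved:

```python
import numpy as np

def soft_threshold(z, t):
    # Proximal operator of the L1 penalty
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    # Minimize (1/2n)||y - X b||^2 + lam * ||b||_1 by cycling over
    # coordinates and applying the closed-form soft-threshold update.
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual excluding feature j's own contribution
            r = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ r / n
            beta[j] = soft_threshold(rho, lam) / (X[:, j] @ X[:, j] / n)
    return beta
```

With orthogonal columns the update converges in a single sweep; correlated designs are what make the repeated cycling necessary.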
-
54
Supplementary Material 8
Published 2025“…XGBoost: an optimized gradient boosting algorithm that efficiently handles large genomic datasets, commonly used for high-accuracy predictions in E. coli classification.…”
-
55
Table_1_Near infrared spectroscopy for cooking time classification of cassava genotypes.docx
Published 2024“…Cooking data were classified into binary and multiclass variables (CT4C and CT6C). …”
-
56
Machine Learning-Ready Dataset for Cytotoxicity Prediction of Metal Oxide Nanoparticles
Published 2025“…Applications and Model Compatibility: The dataset is optimized for use in supervised learning workflows and has been tested with algorithms such as Gradient Boosting Machines (GBM), Support Vector Machines (SVM-RBF), Random Forests, and Principal Component Analysis (PCA) for feature reduction.…”