Public 46th / Private 34th Solution
Method

pre-processing
Add statistical features
Add PCA features
Variance Threshold
Feature selector that removes all low-variance features.
RankGauss
Rank the features and map the ranks to evenly spaced values in (-1, 1)
Apply the inverse error function → yields an approximately Gaussian distribution (see the sketch after this list)
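A minimal sketch of this preprocessing pipeline, assuming the MoA-style `g-`/`c-` column prefixes; the choice of statistics, the component count, and the variance threshold are illustrative assumptions, not the authors' exact settings:

```python
import numpy as np
import pandas as pd
from scipy.special import erfinv
from sklearn.decomposition import PCA
from sklearn.feature_selection import VarianceThreshold

def add_stat_features(df: pd.DataFrame, cols: list, prefix: str) -> pd.DataFrame:
    """Append row-wise summary statistics over a column group."""
    df = df.copy()
    df[f"{prefix}_mean"] = df[cols].mean(axis=1)
    df[f"{prefix}_std"] = df[cols].std(axis=1)
    df[f"{prefix}_kurt"] = df[cols].kurtosis(axis=1)
    df[f"{prefix}_skew"] = df[cols].skew(axis=1)
    return df

def add_pca_features(df: pd.DataFrame, cols: list, n_comp: int, prefix: str) -> pd.DataFrame:
    """Append the first n_comp principal components of a column group."""
    comps = PCA(n_components=n_comp, random_state=42).fit_transform(df[cols])
    pca_df = pd.DataFrame(comps, index=df.index,
                          columns=[f"{prefix}_pca{i}" for i in range(n_comp)])
    return pd.concat([df, pca_df], axis=1)

def rank_gauss(x: np.ndarray) -> np.ndarray:
    """RankGauss: rank the values, rescale the ranks into (-1, 1),
    then apply the inverse error function so the result is ~Gaussian."""
    rank = np.argsort(np.argsort(x))                # ranks 0 .. n-1
    scaled = rank / rank.max() * 2 - 1              # even spacing in [-1, 1]
    scaled = np.clip(scaled, -1 + 1e-6, 1 - 1e-6)   # keep erfinv finite
    return erfinv(scaled)

# Illustrative usage on an MoA-style feature table `train`:
# g_cols = [c for c in train.columns if c.startswith("g-")]
# c_cols = [c for c in train.columns if c.startswith("c-")]
# train = add_stat_features(train, g_cols, "g")
# train = add_pca_features(train, g_cols, n_comp=50, prefix="g")
# selector = VarianceThreshold(threshold=0.8)       # threshold is an assumption
# X = selector.fit_transform(train[g_cols + c_cols])
# X = np.apply_along_axis(rank_gauss, 0, X)         # column-wise RankGauss
```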

modeling
Label Smoothing
Transfer learning from the nonscored targets for the NN and ResNet models
Shallow models
Short training: the NN is trained up to the limit just before the loss becomes NaN
n_steps=1, n_shared=1 for TabNet
Thresholding NN: input → linear → tanh → NN (sketched below)
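A hedged PyTorch sketch of two of these modeling ideas, the thresholding head and label-smoothed BCE. The hidden size, dropout rate, and smoothing factor are assumptions; the write-up only specifies the input → linear → tanh → NN shape:

```python
import torch
import torch.nn as nn

class ThresholdingNN(nn.Module):
    def __init__(self, n_features: int, n_targets: int, hidden: int = 512):
        super().__init__()
        # "Thresholding" front: a linear map squashed by tanh before the MLP.
        self.thresh = nn.Sequential(nn.Linear(n_features, n_features), nn.Tanh())
        self.mlp = nn.Sequential(
            nn.Linear(n_features, hidden),
            nn.BatchNorm1d(hidden),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(hidden, n_targets),  # logits; apply sigmoid outside
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.mlp(self.thresh(x))

def smoothed_bce(logits: torch.Tensor, targets: torch.Tensor,
                 smoothing: float = 0.001) -> torch.Tensor:
    """Label smoothing for multi-label BCE: pull 0/1 labels slightly
    toward 0.5 before the usual BCE-with-logits loss."""
    targets = targets * (1.0 - smoothing) + 0.5 * smoothing
    return nn.functional.binary_cross_entropy_with_logits(logits, targets)
```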
post-processing
Ensembling. In particular, the ensemble of TabNet and the NN is effective.
2-stage stacking with MLP, 1D-CNN, and weight optimization (a weight-optimization sketch follows)
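A minimal sketch of the weight-optimization step: find convex blend weights for out-of-fold predictions by minimizing log loss. The variable names (`oof_preds`, `y_true`) and the SLSQP setup are placeholders, not the authors' code:

```python
import numpy as np
from scipy.optimize import minimize

def log_loss(y_true: np.ndarray, y_pred: np.ndarray, eps: float = 1e-15) -> float:
    """Mean binary cross-entropy over all samples and targets."""
    p = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

def optimize_weights(oof_preds: list, y_true: np.ndarray) -> np.ndarray:
    """oof_preds: list of (n_samples, n_targets) arrays, one per model."""
    def objective(w):
        blend = sum(wi * p for wi, p in zip(w, oof_preds))
        return log_loss(y_true, blend)

    n = len(oof_preds)
    w0 = np.full(n, 1.0 / n)  # start from a uniform blend
    res = minimize(
        objective, w0, method="SLSQP",
        bounds=[(0.0, 1.0)] * n,
        constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0},
    )
    return res.x
```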
What didn't work
pre-processing
modeling
post-processing
Code Structure
Dataset Structure:
Model Weights
Inference code for each model
Python Packages

2. Install Python packages → run inference with the stage-1 models → get each model's predictions


3. Stacking (MLP, 1D-CNN, weight optimization); a 1D-CNN stacker sketch follows
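A hedged sketch of a 1D-CNN stacker: the stage-1 predictions for each target are stacked as channels and convolved across the target axis. The channel counts and kernel sizes are assumptions; the write-up gives no architecture details:

```python
import torch
import torch.nn as nn

class CNNStacker1D(nn.Module):
    def __init__(self, n_models: int, n_targets: int):
        super().__init__()
        # Treat each stage-1 model's predictions as one input channel.
        self.conv = nn.Sequential(
            nn.Conv1d(n_models, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(32, 1, kernel_size=3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_models, n_targets) -> logits of shape (batch, n_targets)
        return self.conv(x).squeeze(1)
```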

Not enough time
Target encoding on the binned g-/c- features
XGBoost, CatBoost, and CNN models as single models (Stage 1)
GCN model as a stacking model (Stage 2)
Netflix Blending
PostPredict by LGBM: we noticed that there are columns the NN can't predict but LGBM can (e.g. cyclooxygenase_inhibitor). Therefore, we came up with the idea of re-predicting only the columns that LGBM handles well, but there was not enough time (a sketch follows this list).
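A sketch of the unrealized PostPredict idea: after the NN pipeline, retrain LightGBM only on the target columns where its out-of-fold log loss beats the NN's, and overwrite those columns in the final prediction. The column-selection input and all parameters here are assumptions:

```python
import numpy as np
import lightgbm as lgb

def post_predict(X_train: np.ndarray, y_train: np.ndarray,
                 X_test: np.ndarray, nn_test_preds: np.ndarray,
                 cols_lgbm_wins: list) -> np.ndarray:
    """cols_lgbm_wins: indices of targets (e.g. cyclooxygenase_inhibitor)
    where LGBM's OOF score was better than the NN's."""
    preds = nn_test_preds.copy()
    for col in cols_lgbm_wins:
        # One binary classifier per re-predicted target column.
        clf = lgb.LGBMClassifier(n_estimators=300, learning_rate=0.05)
        clf.fit(X_train, y_train[:, col])
        preds[:, col] = clf.predict_proba(X_test)[:, 1]
    return preds
```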
Takeaways
Clean Inference Code
Ensembling with various models
Stacking
[Update] Private 3rd Rank with Various Stacking

0.01608 → 0.01599
2D-CNN Stacking
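The write-up gives no architecture for the 2D-CNN stacker; a hedged guess is to treat the (n_models × n_targets) grid of stage-1 predictions as a one-channel image. All layer sizes here are assumptions:

```python
import torch
import torch.nn as nn

class CNNStacker2D(nn.Module):
    def __init__(self, n_models: int, n_targets: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 1, kernel_size=3, padding=1),
        )
        # Collapse the model axis, keeping one logit per target.
        self.head = nn.Linear(n_models, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, n_models, n_targets)
        h = self.conv(x).squeeze(1)                       # (batch, n_models, n_targets)
        return self.head(h.transpose(1, 2)).squeeze(-1)   # (batch, n_targets)
```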

GCN Stacking

Adjacency matrix: a matrix of ones divided by (# of classes)²
Node features: shape (1, 5)
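A hedged sketch of this GCN stacker: each target class is a node, its feature vector holds the five stage-1 model predictions for that class (hence node shape (1, 5)), and the adjacency matrix is a fixed dense matrix of ones scaled by 1 / (# of classes)². The class count (206 scored MoA targets) and hidden size are assumptions:

```python
import torch
import torch.nn as nn

class GCNStacker(nn.Module):
    def __init__(self, n_classes: int = 206, n_models: int = 5, hidden: int = 16):
        super().__init__()
        # Fixed dense adjacency: ones / (# of classes)^2, as described above.
        A = torch.ones(n_classes, n_classes) / n_classes ** 2
        self.register_buffer("A", A)
        self.w1 = nn.Linear(n_models, hidden)
        self.w2 = nn.Linear(hidden, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_classes, n_models); one graph per sample.
        h = torch.relu(self.w1(self.A @ x))      # graph convolution: A · X · W
        return self.w2(self.A @ h).squeeze(-1)   # (batch, n_classes) logits
```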