# Public 46th / Private 34th Solution

## Method

![](https://3092297284-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MOYwbgbzxMoGaqcwNGj%2F-MQ7KZmZz9oMJg_rEpLH%2F-MQ7MeTYlHTqS4cOA7A1%2Fimage.png?alt=media\&token=74066be3-03be-40af-aabf-197e95170d26)

### pre-processing

* Add statistical features
* Add PCA features
* Variance Threshold
  * Feature selector that removes all low-variance features.
* RankGauss
  * Map the sorted feature ranks to evenly spaced values in (-1, 1)
  * Apply the inverse error function → yields a Gaussian distribution
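A minimal sketch of the variance-threshold and RankGauss steps with scikit-learn. The data shapes and the threshold value are illustrative, and `QuantileTransformer` with `output_distribution="normal"` is used as a stand-in for RankGauss (it ranks each feature and maps the ranks through the inverse of the normal CDF, which is the same idea); this is not the team's exact code.

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold
from sklearn.preprocessing import QuantileTransformer

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
X[:, 0] = 1.0  # a constant (zero-variance) column to be dropped

# Variance threshold: drop features whose variance is not above the threshold
selector = VarianceThreshold(threshold=0.0)
X_sel = selector.fit_transform(X)  # the constant column is removed

# RankGauss-style transform: rank each feature, then map the ranks
# through the inverse normal CDF so each column looks Gaussian
qt = QuantileTransformer(n_quantiles=100,
                         output_distribution="normal",
                         random_state=0)
X_gauss = qt.fit_transform(X_sel)
```

In practice the threshold and `n_quantiles` would be tuned; the key point is that both steps are standard scikit-learn transformers that can sit in one pipeline.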

![](https://3092297284-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MOYwbgbzxMoGaqcwNGj%2F-MQ7KZmZz9oMJg_rEpLH%2F-MQ7MpjussXWFECPCCGe%2Fimage.png?alt=media\&token=f0591b49-c21e-48b7-87de-32ffaeadec36)

### modeling

* Label Smoothing
* Transfer learning from the nonscored targets for the NN and ResNet models
* Shallow models
  * NN: short training (few epochs), pushed to the limit just before the loss becomes NaN
  * TabNet: `n_steps=1`, `n_shared=1`
* Thresholding NN: input → linear → tanh → NN
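Label smoothing for multi-label binary targets can be sketched as a simple shrink of the hard 0/1 labels toward 0.5; the `eps` value below is illustrative, not the one used in the solution.

```python
import numpy as np

def smooth_labels(y, eps=0.001):
    """Shrink hard 0/1 targets toward 0.5: y' = y * (1 - eps) + eps / 2.

    Keeps the log loss bounded on confidently wrong predictions,
    which is why it helps on metrics like the competition's log loss.
    """
    return y * (1.0 - eps) + eps / 2.0

# Example: with eps=0.2 the targets 0 and 1 become 0.1 and 0.9
smoothed = smooth_labels(np.array([0.0, 1.0]), eps=0.2)
```

The smoothed targets are then fed to an ordinary binary cross-entropy loss in place of the raw labels.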

### post-processing

* Ensembling. In particular, the TabNet + NN ensemble is effective.
* 2-stage stacking with MLP, 1D-CNN, and weight optimization
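The weight-optimization part of the stacking is commonly done by minimizing the out-of-fold log loss over convex blend weights; a sketch with SciPy follows. The function names, the SLSQP setup, and the sum-to-one constraint are our assumptions about a typical implementation, not the solution's actual code.

```python
import numpy as np
from scipy.optimize import minimize

def log_loss(y, p, eps=1e-15):
    # Binary cross-entropy, clipped to avoid log(0)
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def optimize_weights(preds, y):
    """Find blend weights (non-negative, summing to 1) that minimize
    the log loss of the weighted average of stage-1 OOF predictions."""
    n = len(preds)

    def objective(w):
        blend = sum(wi * pi for wi, pi in zip(w, preds))
        return log_loss(y, blend)

    result = minimize(
        objective,
        np.full(n, 1.0 / n),                       # start from equal weights
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n,
        constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0},
    )
    return result.x
```

The optimizer should push most of the weight toward the model with the better OOF predictions, which is what makes this a cheap but effective stage-2 learner.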

## What Didn't Work

* pre-processing
* modeling
* post-processing

## Code Structure

1. Dataset Structure:

* Model Weights
* Inference code for each model
* Python Packages

![](https://3092297284-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MOYwbgbzxMoGaqcwNGj%2F-MQ7KZmZz9oMJg_rEpLH%2F-MQ7N-8xomF4rf45aQ_M%2Fimage.png?alt=media\&token=db599f39-5afc-4c0c-a09b-161247b53ac6)

2\. Install Python packages → run inference for the stage-1 models → get each model's predictions

![](https://3092297284-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MOYwbgbzxMoGaqcwNGj%2F-MQ7KZmZz9oMJg_rEpLH%2F-MQ7N843r7BT7PvyJ7Eq%2Fimage.png?alt=media\&token=581ef76d-d214-485e-95de-4aea5bd93cda)

![](https://3092297284-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MOYwbgbzxMoGaqcwNGj%2F-MQ7KZmZz9oMJg_rEpLH%2F-MQ7NCiIx75Zc8pWz9W6%2Fimage.png?alt=media\&token=82cc3fcd-6df6-482b-8892-b1df73411625)

3\. Stacking (MLP, 1D CNN, Weight Optimization)

![](https://3092297284-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MOYwbgbzxMoGaqcwNGj%2F-MQ7KZmZz9oMJg_rEpLH%2F-MQ7NItCcyU-y4F39AHp%2Fimage.png?alt=media\&token=f1f02aed-4282-4442-994b-e992d99b949b)

* **Not enough time**
  * Target encoding of the binned g-/c- features
  * XGBoost, CatBoost, and CNN as single models (Stage 1)
  * GCN as a stacking model (Stage 2)
  * Netflix Blending
  * PostPredict by LGBM: we noticed that there are columns the NN can't predict but LGBM can (e.g. `cyclooxygenase_inhibitor`), so we came up with the idea of re-predicting only the columns LGBM is good at. But there wasn't enough time.

## Takeaways

* Clean Inference Code
  * Ensemble using various different models
  * Stacking

## \[Update] Private 3rd Rank with Various Stacking

![](https://3092297284-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MOYwbgbzxMoGaqcwNGj%2F-MQ7KZmZz9oMJg_rEpLH%2F-MQ7NXN0rmpkGmVdjM6l%2Fimage.png?alt=media\&token=b1c51644-0e65-4e1f-8be7-1a4f6c66ce22)

Private score: 0.01608 → 0.01599

### 2D-CNN Stacking

![](https://3092297284-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MOYwbgbzxMoGaqcwNGj%2F-MQ7KZmZz9oMJg_rEpLH%2F-MQ7NbG-pWJLdPuzJaZT%2Fimage.png?alt=media\&token=af75cf4d-d7aa-4a71-91a5-166fa9fc9249)

### GCN Stacking

![](https://3092297284-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MOYwbgbzxMoGaqcwNGj%2F-MQ7KZmZz9oMJg_rEpLH%2F-MQ7Nek1Q9W77MZhstsV%2Fimage.png?alt=media\&token=6a5eb273-1343-4cfc-9686-6366537adf47)

* Adjacency matrix: a matrix of ones divided by (# of classes)^2
* Node feature shape: (1, 5), i.e. the 5 stage-1 predictions for that target
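The graph setup above can be sketched in NumPy as a single graph-convolution step. The 206 target count (the number of scored MoA targets), the hidden size, and the `tanh` activation are our assumptions; the source only specifies the uniform adjacency matrix and the (1, 5) node features.

```python
import numpy as np

n_classes = 206  # assumed: one node per scored target
n_models = 5     # each node's features are the 5 stage-1 predictions

# Uniform adjacency: every pair of target nodes connected with equal
# weight; all entries are 1 / n_classes^2, so the matrix sums to 1
A = np.ones((n_classes, n_classes)) / n_classes ** 2

# Node feature matrix: one row per target, 5 stage-1 predictions each
X = np.random.default_rng(0).random((n_classes, n_models))

# One graph-convolution step: aggregate neighbor features via A,
# then apply a learned linear transform W and a nonlinearity
W = np.random.default_rng(1).normal(size=(n_models, 16))
H = np.tanh(A @ X @ W)
```

With a uniform adjacency, each node's aggregated input is the same global average of all stage-1 predictions, so the graph structure here acts more as a shared global context than as a learned target-to-target relationship.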
