Writing the MoA Inference Notebook

Setting

Install and import libraries

# TabNet
!pip install --no-index --find-links /kaggle/input/pytorchtabnet/pytorch_tabnet-2.0.0-py3-none-any.whl pytorch-tabnet > /dev/null
# Iterative Stratification
!pip install /kaggle/input/iterative-stratification/iterative-stratification-master/ > /dev/null
import os
import random
from shutil import copytree, ignore_patterns
import numpy as np
import pandas as pd
import pickle

import torch
import torch.optim as optim
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.modules.loss import _WeightedLoss
from torch.utils.data import Dataset, DataLoader
from torch.optim.lr_scheduler import ReduceLROnPlateau, StepLR
from sklearn.metrics import roc_auc_score
from pytorch_tabnet.tab_network import TabNet
from pytorch_tabnet.metrics import Metric

from scipy.stats import kurtosis
from sklearn.svm import LinearSVC
from sklearn.feature_selection import SelectFromModel, VarianceThreshold, SelectKBest
from sklearn.preprocessing import StandardScaler, RobustScaler, QuantileTransformer
from sklearn.decomposition import PCA, NMF
from iterstrat.ml_stratifiers import MultilabelStratifiedKFold

import warnings
warnings.filterwarnings('ignore')

Preprocessing code

- cate2num: converts categorical features to numerical features
- make_folds: iterative stratified K-fold for the multi-label classification task
- feature encoding: Rank Gauss, feature statistics, PCA, Variance Threshold, LSVM feature selection
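As an illustration of the cate2num step, here is a minimal sketch of a fixed categorical-to-numerical mapping. The column names (`cp_type`, `cp_time`, `cp_dose`), their category values, and the integer codes are assumptions based on the public MoA data, not taken from the notebook, and the function body is hypothetical.

```python
import pandas as pd

# Hypothetical mappings: column names, category values, and integer codes
# are assumptions for illustration, not the notebook's actual settings.
CATE_MAPS = {
    "cp_type": {"trt_cp": 0, "ctl_vehicle": 1},
    "cp_time": {24: 0, 48: 1, 72: 2},
    "cp_dose": {"D1": 0, "D2": 1},
}

def cate2num(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with each known categorical column mapped to integer codes."""
    out = df.copy()
    for col, mapping in CATE_MAPS.items():
        if col in out.columns:
            out[col] = out[col].map(mapping)
    return out

sample = pd.DataFrame({
    "cp_type": ["trt_cp", "ctl_vehicle"],
    "cp_time": [24, 72],
    "cp_dose": ["D1", "D2"],
    "g-0": [0.5, -1.2],
})
encoded = cate2num(sample)
```

Using a fixed dictionary rather than an encoder fitted on the data keeps the codes identical between training and inference, even when the test set does not contain every category.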

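With Variance Threshold, the selected features change whenever the data changes, so running inference with previously saved weights can raise errors on the private test set. To avoid this, the feature mask is created separately and reused during preprocessing.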
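The mask idea can be sketched as follows. This is a minimal illustration, not the notebook's code: the helper names `fit_variance_mask` and `apply_mask`, the toy data, and the threshold of 0.8 are all assumptions. The mask is computed once from the training data and then applied unchanged to any test set, so the number of input features always matches what the saved weights expect.

```python
import numpy as np
import pandas as pd

def fit_variance_mask(train: pd.DataFrame, threshold: float) -> np.ndarray:
    """Boolean mask over columns whose training-set population variance exceeds threshold."""
    return (train.var(axis=0, ddof=0) > threshold).to_numpy()

def apply_mask(df: pd.DataFrame, mask: np.ndarray) -> pd.DataFrame:
    """Keep only the columns selected by a previously computed mask."""
    return df.loc[:, mask]

# Toy training data (hypothetical): column "b" is nearly constant.
train = pd.DataFrame({
    "a": [0.0, 1.0, 2.0, 3.0],
    "b": [1.0, 1.0, 1.0, 1.1],
    "c": [5.0, -5.0, 5.0, -5.0],
})
mask = fit_variance_mask(train, threshold=0.8)  # threshold value is an assumption

# A test set with a different distribution: "b" now varies a lot,
# but the stored mask still drops it, so the feature count is stable.
test = pd.DataFrame({
    "a": [0.1, 0.1],
    "b": [-10.0, 10.0],
    "c": [0.0, 0.0],
})
selected = apply_mask(test, mask)
```

In practice the mask would be saved alongside the model weights (for example with `pickle` or `np.save`) and loaded in the inference notebook instead of being refit on the test data.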

Inference

Configurations

The predefined config values are managed in a dataframe so that they can be used when ensembling.
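One way to picture this is a config table with one row per model, which then drives a weighted average of the per-model predictions. The sketch below is hypothetical: the column names (`name`, `seed`, `n_folds`, `weight`), the model names, and the `ensemble` helper are assumptions for illustration, and the prediction arrays stand in for real model outputs.

```python
import numpy as np
import pandas as pd

# Hypothetical config table: names, seeds, fold counts, and weights are illustrative.
configs = pd.DataFrame([
    {"name": "tabnet_a", "seed": 0, "n_folds": 5, "weight": 0.6},
    {"name": "tabnet_b", "seed": 1, "n_folds": 5, "weight": 0.4},
])

def ensemble(configs: pd.DataFrame, preds: dict) -> np.ndarray:
    """Weighted average of per-model predictions, using weights from the config table."""
    weights = configs["weight"].to_numpy(dtype=float)
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    # Stack to shape (n_models, n_samples, n_targets), in config row order.
    stacked = np.stack([preds[name] for name in configs["name"]])
    # Contract the model axis: result has shape (n_samples, n_targets).
    return np.tensordot(weights, stacked, axes=1)

# Placeholder predictions for 3 samples and 2 targets.
preds = {
    "tabnet_a": np.full((3, 2), 0.2),
    "tabnet_b": np.full((3, 2), 0.7),
}
blended = ensemble(configs, preds)
```

Keeping the weights in the same table as the other settings means that adding or removing a model from the ensemble only requires editing one row.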

Predict testset
