Graph Masked Autoencoder Enhanced Predictor for Neural Architecture Search

Posted on December 30, 2022 ∼ 1 minute read ∼ Filed in: Inference Serving

Main Idea

Problem

Efficient NAS relies on a performance predictor, but it is challenging to train an accurate predictor with only a few architecture evaluations (labeled architectures).

Most existing work trains the predictor only on labeled (trained) architectures and ignores the many untrained (unlabeled) architectures in the search space, thus missing an opportunity for improvement.

Solution

The paper constructs an architecture performance predictor in two stages:

  1. Pre-train the model with the many unlabeled architectures (untrained architectures) in the search space (see the pre-training sketch after this list):
    1. Encoder: a GAT (graph attention network) takes the (partially masked) architecture graph as input and outputs a new graph of updated vertex features.

    2. Decoder: a linear projection decodes the vertex features and predicts each masked vertex's operation type.

    3. Objective function: \(L = -\frac{1}{N_{mv}} \sum_{i \in mv} \sum_{c=1}^{C} y_{ic} \log p_{ic}\), where \(mv\) is the set of masked vertices and \(N_{mv}\) its size, \(p_{ic}\) is the predicted probability that vertex \(i\) has operation category \(c\), and \(y_{ic}\) is 1 if the true category of vertex \(i\) is \(c\) and 0 otherwise.

  2. Then fine-tune the model with the labeled dataset (architectures whose performance has been evaluated).

    1. Discard the decoder; keep only the encoder with its pre-trained parameters and append fully connected layers.

      Fine-tuning updates only the fully connected layers' weights; the pre-trained encoder is kept frozen.

    2. Objective function: the end-to-end fine-tuning predicts the relative ranking of architectures rather than their absolute performance, so a ranking loss is used (see the fine-tuning sketch below).
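
To make the pre-training stage concrete, here is a minimal sketch assuming PyTorch and torch_geometric's `GATConv`; the class name `MaskedGraphAutoencoder`, the two-layer encoder, the hidden sizes, and the extra [MASK] embedding are illustrative assumptions, not the paper's exact implementation.

```python
# Minimal sketch of the pre-training stage (assumptions: PyTorch + torch_geometric,
# a 2-layer GAT encoder, a linear decoder; names and sizes are illustrative).
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import GATConv


class MaskedGraphAutoencoder(nn.Module):
    def __init__(self, num_ops, hidden_dim=64, heads=4):
        super().__init__()
        self.op_embed = nn.Embedding(num_ops + 1, hidden_dim)  # +1 for the [MASK] token
        self.mask_token = num_ops
        # Encoder: GAT layers over the architecture graph
        self.gat1 = GATConv(hidden_dim, hidden_dim, heads=heads, concat=False)
        self.gat2 = GATConv(hidden_dim, hidden_dim, heads=heads, concat=False)
        # Decoder: linear projection back to operation-type logits
        self.decoder = nn.Linear(hidden_dim, num_ops)

    def forward(self, op_ids, edge_index, mask):
        # Replace the masked vertices' operation ids with the [MASK] token
        masked_ops = op_ids.clone()
        masked_ops[mask] = self.mask_token
        x = self.op_embed(masked_ops)
        x = F.elu(self.gat1(x, edge_index))
        x = self.gat2(x, edge_index)   # updated vertex features (the "new graph")
        return self.decoder(x)         # per-vertex operation logits


def pretrain_loss(logits, op_ids, mask):
    # Cross-entropy averaged over the masked vertices only (the objective L above)
    return F.cross_entropy(logits[mask], op_ids[mask])
```

In each pre-training step, a random subset of vertices is masked and only those vertices contribute to the loss, matching the \(1/N_{mv}\) averaging in the objective above.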



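The fine-tuning stage can be sketched in the same spirit, reusing the `MaskedGraphAutoencoder` from the previous sketch with its parameters frozen; the mean pooling over vertex features and the pairwise hinge ranking loss are assumptions, since the post does not spell out the exact ranking objective.

```python
# Minimal sketch of the fine-tuning stage (assumptions: frozen pre-trained encoder,
# mean pooling, and a hypothetical pairwise hinge ranking loss).
import torch
import torch.nn as nn
import torch.nn.functional as F


class RankingPredictor(nn.Module):
    def __init__(self, pretrained_ae, hidden_dim=64):
        super().__init__()
        self.encoder = pretrained_ae  # reuse the pre-trained GAT encoder, drop the decoder
        for p in self.encoder.parameters():
            p.requires_grad = False   # only the FC head is updated during fine-tuning
        self.head = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, 1)
        )

    def forward(self, op_ids, edge_index):
        # Encode without masking, then pool vertex features into one graph embedding
        x = self.encoder.op_embed(op_ids)
        x = F.elu(self.encoder.gat1(x, edge_index))
        x = self.encoder.gat2(x, edge_index)
        return self.head(x.mean(dim=0))   # scalar score used only for ranking


def pairwise_ranking_loss(score_i, score_j, margin=0.1):
    # Encourages score_i > score_j when architecture i truly outperforms architecture j
    return torch.clamp(margin - (score_i - score_j), min=0).mean()
```

Training then iterates over pairs of labeled architectures, pushing the predicted score of the better-performing one above the other by a margin; only the FC head's parameters receive gradients, consistent with updating the fully connected layers only.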

