Wednesday 15 August 2012

machine learning - why pretraining for convolutional neural networks

Usually a backpropagation-trained neural network has the problem of vanishing gradients. I found that convolutional neural networks (CNNs) somehow get rid of the vanishing gradient problem (why?).

Also, pretraining approaches for CNNs have been discussed in several papers. Could you explain the following to me?

(1) What are the reasons for pretraining in CNNs? (2) What are the problems/limitations of CNNs? (3) Are there relevant papers discussing the limitations of CNNs?

Thanks in advance.

Pretraining is a regularization technique: it improves the generalization accuracy of the model. Since the network is exposed to a large amount of information (we have vast amounts of unsupervised data in many tasks), the weight parameters are carried to a region of the parameter space that represents the overall data distribution, rather than overfitting a specific subset of that distribution. Neural nets with high representational capacity (tons of hidden units) tend to overfit the data and are vulnerable to random parameter initializations. Also, because the initial layers are already initialized before the supervised stage, the gradient dilution problem is not severe anymore. That is why pretraining is used as an initial step before the supervised task is carried out with a gradient descent algorithm.
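A minimal sketch of this two-stage idea, assuming PyTorch and made-up data shapes (neither is from the original post): a hidden layer is first trained as an autoencoder on unlabelled data, and the resulting weights are kept as the initialization for the supervised network.

```python
# Sketch: unsupervised pretraining followed by supervised fine-tuning.
# PyTorch, the data shapes, and all hyperparameter values are assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical data: many unlabelled examples, few labelled ones (10 classes).
x_unlabelled = torch.randn(1000, 784)
x_labelled = torch.randn(100, 784)
y_labelled = torch.randint(0, 10, (100,))

# Stage 1: pretrain a hidden layer as an autoencoder on the unlabelled data.
encoder = nn.Linear(784, 256)
decoder = nn.Linear(256, 784)
opt = torch.optim.SGD(list(encoder.parameters()) + list(decoder.parameters()), lr=0.01)
for epoch in range(20):
    opt.zero_grad()
    recon = decoder(torch.sigmoid(encoder(x_unlabelled)))
    loss = nn.functional.mse_loss(recon, x_unlabelled)
    loss.backward()
    opt.step()

# Stage 2: keep the pretrained encoder as the initialization and fine-tune
# the whole network on the small labelled set with gradient descent.
classifier = nn.Sequential(encoder, nn.Sigmoid(), nn.Linear(256, 10))
opt = torch.optim.SGD(classifier.parameters(), lr=0.01)
for epoch in range(20):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(classifier(x_labelled), y_labelled)
    loss.backward()
    opt.step()
```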

CNNs share the same fate as other neural nets. There are many parameters to tune: the optimal input patch size, the number of hidden layers, the number of feature maps per layer, the pooling and stride sizes, the normalization windows, the learning rate, and others. Thus, the problem of model selection is relatively harder compared to other ML techniques. Training of big networks is carried out either on GPUs or on clusters of CPUs.
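For concreteness, here is a minimal sketch of where these hyperparameters show up in a small CNN definition, again assuming PyTorch and arbitrary example values (the 28x28 input patch size and every number below are illustrative, not from the original post; the normalization window is omitted).

```python
# Sketch of the CNN hyperparameters that model selection has to choose.
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=5),        # feature maps and filter size, layer 1
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),  # pooling and stride sizes
    nn.Conv2d(16, 32, kernel_size=5),       # feature maps and filter size, layer 2
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),
    nn.Flatten(),
    nn.Linear(32 * 4 * 4, 10),              # depends on the 28x28 input patch size
)
learning_rate = 0.1  # another value picked by validation
```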

machine-learning computer-vision neural-network
