Convolutional neural networks: Aren't the central neurons over-represented in the output?

[This question was originally posed at Cross Validated.]

The question in short

I'm studying convolutional neural networks, and I believe that these networks do not treat every input neuron (pixel/parameter) equivalently. Imagine we have a deep network (many layers) that applies convolution on some input image. The neurons in the "middle" of the image have many unique pathways to many deeper-layer neurons, which means that a small variation in the middle neurons has a strong effect on the output. However, the neurons at the border of the image have only 1 (or, depending on the exact implementation, of the order of 1) pathway along which their information flows through the graph. It seems that these are "under-represented".

I am concerned about this, because the discrimination of border neurons scales exponentially with the depth (number of layers) of the network. Even adding a max-pooling layer won't halt the exponential increase; only a fully connected layer brings all neurons on an equal footing. I'm not convinced that my reasoning is correct, though, so my questions are:

Am I right that this effect takes place in deep convolutional networks?
Is there any theory about this; has it ever been mentioned in the literature?
Are there ways to overcome this effect?

Since I'm not sure whether this gives sufficient information, I'll elaborate a bit more on the problem statement, and on why I believe this is a concern.

More detailed explanation

Imagine we have a deep neural network that takes an image as input. Assume we apply a convolutional filter of 64x64 pixels over the image, where we shift the convolution window by 4 pixels each time. This means that every neuron in the input sends its activation to 16x16 = 256 neurons in layer 2. Each of these neurons might send their activation to another 256 neurons, such that our topmost neuron is represented in 256^2 output neurons, and so on. This is, however, not true for neurons on the edges: these might be represented in only a small number of convolution windows, causing them to activate (of the order of) only 1 neuron in the next layer. Using tricks such as mirroring along the edges won't help this: the second-layer neurons that will be projected to are still at the edges, which means that those second-layer neurons will be underrepresented (thus limiting the importance of our border neurons as well). As can be seen, this discrepancy scales exponentially with the number of layers.
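To make this concrete, here is a minimal sketch (my own; it assumes a 200x200 input, as in the car example further down, with the 64x64 filter and stride 4 from above) that counts, for each 1D input position, how many layer-2 windows cover it:

    import numpy as np

    def coverage_1d(length=200, window=64, stride=4):
        # For each input position, count how many convolution windows
        # (i.e. layer-2 neurons) include it.
        n_windows = (length - window) // stride + 1
        counts = np.zeros(length, dtype=int)
        for p in range(n_windows):
            counts[p * stride : p * stride + window] += 1
        return counts

    c = coverage_1d()
    print(c[0], c[100], c[-1])  # 1 16 1: corner vs. center vs. corner

In 2D the counts multiply, so an interior pixel feeds 16x16 = 256 layer-2 neurons while a corner pixel feeds just 1.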

I have created an image to visualize the problem, which can be found here (I'm not allowed to include images in the post itself). This network has a convolution window of size 3. The numbers next to the neurons indicate the number of pathways down to the deepest neuron. The image is reminiscent of Pascal's triangle.

https://www.dropbox.com/s/7rbwv7z14j4h0jr/deep_conv_problem_stackxchange.png?dl=0
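The pathway counts in such a figure can be reproduced with a short sketch (my own; the sizes of 9 inputs and 4 layers are illustrative assumptions, not taken from the figure). Starting from one pathway per neuron in the deepest layer, a full convolution with a ones kernel sums, for each neuron one layer up, the path counts of all the neurons it feeds:

    import numpy as np

    def pathway_counts(n_inputs=9, window=3, n_layers=4):
        # Number of downward pathways from each input neuron to the
        # deepest layer of a stack of 1D convolutions (stride 1, no padding).
        deepest = n_inputs - n_layers * (window - 1)
        paths = np.ones(deepest, dtype=int)
        for _ in range(n_layers):
            paths = np.convolve(paths, np.ones(window, dtype=int))  # mode='full'
        return paths

    print(pathway_counts())  # [ 1  4 10 16 19 16 10  4  1]

The Pascal's-triangle-like pattern is visible: 19 pathways from the central input neuron versus 1 from each border neuron.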

Why is this a problem?

This effect doesn't seem to be a problem at first sight: in principle, the weights should automatically adjust in such a way that the network does its job. Moreover, the edges of an image are not that important anyway in image recognition. This effect might not be noticeable in everyday image recognition tests, but it still concerns me for two reasons: 1) generalization to other applications, and 2) problems arising in the case of very deep networks.

1) There might be other applications, like speech or sound recognition, where it is not true that the middle-most neurons are the most important. Applying convolution is often done in this field, but I haven't been able to find papers that mention the effect I'm concerned with.

2) Very deep networks will notice an exponentially worse effect of the discrimination of boundary neurons, which means that central neurons can be overrepresented by multiple orders of magnitude (imagine we have 10 layers such that the above illustration would give 256^10 ways the central neurons can project their information). As one increases the number of layers, one is bound to hit a limit where the weights cannot feasibly compensate for this effect. Now imagine we perturb all neurons by a small amount. The central neurons will cause the output to change by several orders of magnitude more, compared to the border neurons; see the sketch below. I believe that for general applications, and for very deep networks, ways around my problem should be found?
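As a toy illustration of that perturbation claim (my own sketch: a linear 1D conv stack with all-ones weights, so the sensitivity of the summed output to each input pixel equals exactly its pathway count):

    import numpy as np

    def conv_stack_output(x, window=3, n_layers=4):
        # Toy linear conv stack: all weights 1, stride 1, no padding;
        # returns the sum of the deepest layer's activations.
        for _ in range(n_layers):
            x = np.convolve(x, np.ones(window), mode='valid')
        return x.sum()

    x, eps = np.zeros(9), 1e-3
    base = conv_stack_output(x)
    for i in (0, 4):  # border pixel vs. central pixel
        xp = x.copy()
        xp[i] += eps
        print(i, (conv_stack_output(xp) - base) / eps)  # ~1.0 vs. ~19.0

The same perturbation changes the output 19 times more when applied to the central pixel than to the border pixel, and this ratio grows with depth.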

I will quote your sentences and write my answers below them.

Am I right that this effect takes place in deep convolutional networks?

I think you are wrong in general, but right according to your 64x64-sized convolution filter example. When structuring the convolution layers' filter sizes, the filters are never bigger than what we are looking for in the images. In other words: if our images are 200x200 and we convolve with 64x64 patches, we expect these 64x64 patches to learn parts of the image, or the exact image patch that identifies the category. The idea is that in the first layer we learn edge-like important parts of the images, not the entire cat or car itself.

Is there any theory about this; has it ever been mentioned in the literature? And are there ways to overcome this effect?

I have never seen it in any paper I have looked through so far. And I do not think this would be an issue even for very deep networks.

There is no such effect. Suppose our first layer has learned 64x64 patches and one is in action. If the patch in the top-left-most corner fired (became active), it will show up as a 1 in the next layer's top-left-most corner, and hence the information will be propagated through the network.

(Not quoted) You should not think of 'a pixel being useful in more neurons as it gets closer to the center'. Think of a 64x64 filter with a stride of 4:

If the pattern that the 64x64 filter looks for is in the top-most-left corner of the image, it will be propagated to the next layer's top-left corner; otherwise there will be nothing in the next layer.

The idea is to keep the meaningful parts of the image alive while suppressing the non-meaningful, dull parts, and to combine these meaningful parts in the next layers. In the case of learning "an uppercase letter A", please look at the images in the old paper of Fukushima 1980 (http://www.cs.princeton.edu/courses/archive/spr08/cos598b/readings/fukushima1980.pdf), figures 7 and 5. Hence there is no importance of a pixel; there is importance of an image patch the size of the convolution layer's filter.

The central neurons will cause the output to change by several orders of magnitude more, compared to the border neurons. I believe that for general applications, and for very deep networks, ways around my problem should be found?

Suppose we are looking for a car in an image,

and suppose that in the 1st example the car is in the 64x64 top-left-most part of the 200x200 image, and in the 2nd example the car is in the 64x64 bottom-right-most part of the 200x200 image.

In the second layer, all pixel values will be 0: for the 1st image, except for a 1 in the top-left-most corner, and for the 2nd image, except for a 1 in the bottom-right-most corner.

Now, the center part of the image will mean nothing to the forward and backward propagation, because the values will be 0. But the corner values will never be discarded, and they will affect the learned weights.
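A minimal numerical sketch of this (my own illustration, 1D for simplicity, with a toy ones-filter standing in for the next layer's learned filter; the 35 comes from the (200-64)/4 + 1 = 35 window positions per dimension):

    import numpy as np

    # Layer-2 activations after the "car" detector fired only in the
    # top-left-most window: a single 1 at the border, zeros elsewhere.
    fmap = np.zeros(35)
    fmap[0] = 1.0

    # Next conv layer (valid mode): the border activation is carried
    # forward rather than discarded, while the all-zero center
    # contributes nothing.
    out = np.convolve(fmap, np.ones(3), mode='valid')
    print(out[0], out.sum())  # 1.0 1.0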

Tags: machine-learning, neural-network, convolution
