Neural Networks For Periodic Features | by Dr. Robert Kübler




January 17, 2024



When ReLU’s extrapolation capabilities are not enough

Photograph by Willian Justen de Vasconcellos on Unsplash

Neural networks are known to be great approximators for any function, at least as long as we don’t move too far away from our dataset. Let us see what that means. Here is some data:

It doesn’t only look like a sine wave, it actually is one, with some noise added. We can now train a normal feed-forward neural network with 1 hidden layer of 1000 neurons and ReLU activation. We get the following fit:
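A setup like the one described above can be sketched in a few lines. This is only a rough reconstruction: the input range `[-10, 10]`, the sample count, the noise level, and the use of scikit-learn's `MLPRegressor` are assumptions made for illustration, not details taken from the article.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Assumed data: 200 points of sin(x) on [-10, 10] plus Gaussian noise.
X = rng.uniform(-10, 10, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

# One hidden layer with 1000 ReLU neurons, as described in the text.
model = MLPRegressor(
    hidden_layer_sizes=(1000,),
    activation="relu",
    max_iter=2000,
    random_state=0,
)
model.fit(X, y)

# Predict on a grid inside the observed range to inspect the fit.
X_grid = np.linspace(-10, 10, 500).reshape(-1, 1)
y_fit = model.predict(X_grid)
```

Plotting `y_fit` against `X_grid` next to the noisy points gives a picture of the fit. Widening the grid beyond the assumed range shows what the model does outside the data.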

It looks quite decent, except at the edges. We could fix this by adding more neurons to the hidden layer, according to Cybenko’s universal approximation theorem. But I want to point you to something else:

We could argue now that this extrapolation behavior is bad if we expect the wave pattern to continue outside of the observed range. But if there is no domain knowledge or additional data we can resort to, it would be just that: an assumption.

However, in the remainder of this article, we will assume that any periodic pattern we can pick up within the data continues outside as well. This is a common assumption in time series modeling, where we naturally want to extrapolate into the future. We assume that any observed seasonality in the training data will simply continue like that, because what else can we say without any additional information? In this article, I want to show you how using sine-based activation functions helps bake this assumption into the model.
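To get a feeling for why a sine activation encodes this assumption, here is a minimal sketch of a single hidden layer with `sin` as its activation. The function `sine_layer` and all weights are hypothetical and chosen only for illustration; they are not the model built later in the article. The frequencies are deliberately set to integer multiples of one base frequency, so the output repeats exactly with period `2 * pi / base_freq`.

```python
import math
import random

random.seed(0)

def sine_layer(x, freqs, phases, out_weights, out_bias):
    """One hidden layer with sin activation: sum_k a_k * sin(w_k * x + b_k) + c."""
    return out_bias + sum(
        a * math.sin(w * x + b) for w, b, a in zip(freqs, phases, out_weights)
    )

# Hypothetical weights. Frequencies are integer multiples of a base frequency,
# so every hidden unit (and hence the output) repeats with period 2*pi / base_freq.
base_freq = 1.0
freqs = [base_freq * k for k in (1, 2, 3)]
phases = [random.uniform(-math.pi, math.pi) for _ in freqs]
out_weights = [random.gauss(0.0, 1.0) for _ in freqs]
out_bias = 0.5

period = 2 * math.pi / base_freq
x_inside = 1.3                    # a point inside some observed range
x_far = x_inside + 100 * period   # the same phase, far outside that range

y_inside = sine_layer(x_inside, freqs, phases, out_weights, out_bias)
y_far = sine_layer(x_far, freqs, phases, out_weights, out_bias)
```

Here `y_inside` and `y_far` agree up to floating-point error: whatever the layer outputs inside the observed range, it repeats arbitrarily far outside of it. In a trained model the frequencies are learned and need not be exact multiples of each other. The output is then a sum of periodic terms rather than strictly periodic, but it still oscillates outside the data instead of following a straight line.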

But before we go there, let us quickly dive deeper into how ReLU-based neural networks extrapolate in general, and why we should not use them for time series forecasting as is.
