Independently I was just wondering tonight why weather forecasting does not use some Machine Learning techniques as there is an abundance of training data. Wouldn't any number of regression ML algorithms find a more accurate prediction model? If Google did weather....
ML techniques certainly are being used to predict the weather, maybe just not at NOAA or the Weather Channel. The weather derivative industry[0] is an $8 billion industry, making it larger than the entire rocket launch industry (public and private combined). Just as there are Elon Musks in the rocket launch industry, there are maverick hedge funds out there making a killing by predicting the weather using the latest Big Data tools.
Sometimes Machine Learning is the wrong approach. Direct mathematical modeling, if the underlying causal phenomena are well known, bypasses the whole learning aspect of an explanatory system. Machine learning might help make sense of the residual noise that can't be accounted for by direct mathematical modeling. But at the level of sophistication I've seen (predicting wind patterns using Finite Element Analysis methods mapped to topographical data fine-grained enough to model small buildings), I can't imagine any machine learning method making sense of the residual noise.
I agree that Machine Learning is not always the right approach, but wouldn't it be a relatively inexpensive experiment to see if it could be used for more accurate predictions? I'm surprised that topographical patterns are mapped that precisely when ASOS and AWOS stations are so far apart. Also, the weather data from commercial airplanes is so far above the surface that modeling small buildings would be meaningless.
I heard a talk from ex-Googler climate.com (now Monsanto) guys. Their forecasts were accurate enough to build a very profitable insurance business. I bet they used Bayesian rather than FEM techniques.
Is it that 'physics simulation' forecasting is best for short timescales (e.g. < 1 week in the UK), whilst ML can be better for estimating next year's weather statistics in some particular place? (Because the butterfly effect means that simulating the actual weather out to 1 year 'accurately' is not possible.)
Presumably any short-term ML forecasting would use the physics simulation as its main input, and then try to improve slightly on it (e.g. maybe you observe that in a particular small area, the rainfall is on average 10% more than the physics simulation predicts, so your ML method would add 10%, giving you a slight forecast improvement)?
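To make that concrete, here is a toy sketch of the idea (pure numpy, entirely synthetic data; the 10% bias and the one-parameter linear correction are just the hypothetical from above, not any real forecast product):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: "physics model" rainfall forecasts, and observations
# that are on average 10% higher plus noise (the hypothetical local bias).
model_rain = rng.gamma(shape=2.0, scale=3.0, size=500)          # mm/day, model output
obs_rain = 1.10 * model_rain + rng.normal(0.0, 0.5, size=500)   # "truth"

# Fit a one-parameter correction obs ~= a * model by least squares.
a = np.dot(model_rain, obs_rain) / np.dot(model_rain, model_rain)

corrected = a * model_rain
raw_error = np.mean((obs_rain - model_rain) ** 2)
corrected_error = np.mean((obs_rain - corrected) ** 2)
print(f"fitted scale factor: {a:.3f}")   # should recover something near 1.10
print(f"MSE raw vs corrected: {raw_error:.3f} vs {corrected_error:.3f}")
```

A real post-processor (MOS-style statistical correction) would use many predictors and locations, but the shape is the same: the physics output is the feature, the observation is the target.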
Do they also make raw sensor data available for free? I.e. if I wanted to build my own forecast system based on this code, I would still need all the data they collect with weather balloons, right?
I'd assumed that 'trials' meant 'snafu in a big software project'.
But actually it's about the butterfly effect making it impossible to get exact agreement between the old and new systems for forecasts longer than about 5 days.
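That divergence is easy to reproduce with the classic Lorenz-63 toy system (the standard textbook illustration of sensitive dependence, nothing to do with the actual forecast model): perturb the initial state by one part in a billion and the two runs end up nowhere near each other.

```python
import numpy as np

def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One RK4 step of the Lorenz-63 system."""
    def f(s):
        x, y, z = s
        return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])
    k1 = f(state)
    k2 = f(state + 0.5 * dt * k1)
    k3 = f(state + 0.5 * dt * k2)
    k4 = f(state + dt * k3)
    return state + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

a = np.array([1.0, 1.0, 1.0])
b = a + np.array([1e-9, 0.0, 0.0])   # perturbed by one part in a billion

max_sep = 0.0
for _ in range(4000):                # integrate 40 model time units
    a = lorenz_step(a)
    b = lorenz_step(b)
    max_sep = max(max_sep, float(np.linalg.norm(a - b)))

print(f"max separation: {max_sep:.3f}")   # grows many orders of magnitude above 1e-9
```

So "agrees with the old system" can only ever mean "agrees statistically", not bit-for-bit, once you're past the predictability horizon.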
You don't want the "raw sensor data". You need it after it's been processed to the appropriate level, to retrieve geophysically-relevant quantities (like temperatures and wind speeds) from the data that is measured. That is not done by the forecasters.
Fortran 90 is still the language a lot of computational work is written in today. I'm not convinced, however, that that choice is made for performance reasons rather than cultural ones.
Still, Fortran 90 is a hell of a lot better than Fortran 77, which unfortunately is the language of choice for some of the people I'm being asked to work with.
Indeed, automatic (and largely implicit) use of SIMD and OpenMP by the compiler is very very good with F90 compared to almost all other widely used languages.
I'm curious (really curious, although this might sound like a troll), have you ever tried a benchmark test on this? Things like this [0] come up every so often, and make me wonder about the efficiency advantage for coding in Fortran.
This is also relevant and insightful [1]. The top answer talks about Fortran's strict aliasing semantics.
[0] is a classic example of comparing programming languages attacking a problem with different algorithms.
I see that the author insults Fortran a couple of times, and I don't see any indication that they tried to implement the better algorithm in Fortran. Did I miss something?
If you look at Julia[0] they have quite a lot of performance benchmarks and you will see that Fortran is always at or near the top. What is really important, from a numerical code point of view, is that with Fortran it is easy to be at the top by operating on matrices and vectors etc. You do not have to torture your mind and think too much to optimize your code, "remove the IF in the loops" is often just what you need.
I am writing code in Fortran 77/90 every day, and I really enjoy it. Calling your Fortran code from Python is also so easy that you can have a clean separation between data management in Python and computational work in Fortran.
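For what it's worth, the "remove the IF in the loops" advice carries over to any array language, so here is the same whole-array style sketched in numpy (function names are made up; this illustrates the style, not Fortran's actual compiler behaviour, which vectorises the branch-free form with SIMD much more readily):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=100_000)

def clamp_loop(values):
    """Branchy version: an explicit loop with an IF per element."""
    out = np.empty_like(values)
    for i, v in enumerate(values):
        out[i] = v if v > 0.0 else 0.0
    return out

def clamp_vectorised(values):
    """Branch-free version: one whole-array operation, no per-element IF."""
    return np.maximum(values, 0.0)

# Both produce identical results; only the style differs.
print(np.allclose(clamp_loop(x), clamp_vectorised(x)))
```

In Fortran the equivalent move is replacing a DO loop full of IFs with `max(array, 0.0)` or a WHERE construct, and letting the compiler do the rest.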
imo, fortran90 is about the easiest language to follow - if language is all that is stopping you (and you know some other language already), then you should just pile straight in.
From looking at some of the code, what is hard for me to follow is the 'overall view'. The documentation (of the model) looks quite nicely done, but there must be a huge amount of fluid dynamics (etc.) knowledge behind it that would take me years to learn.