Wednesday, June 27, 2018

book, Atmospheric modeling, data assimilation and predictability

Atmospheric modeling, data assimilation and predictability
by Eugenia Kalnay in 2003
Because of their higher resolution, regional models have the advantage of higher accuracy and the ability to reproduce smaller-scale phenomena such as fronts, squall lines, and much better orographic forcing than global models. On the other hand, regional models have the disadvantage that they are not “self-contained”, because they require lateral boundary conditions at the borders of the horizontal domain. These boundary conditions must be as accurate as possible, because otherwise the interior solution of the regional models quickly deteriorates. Therefore it is customary to nest the regional models within another model with coarser resolution, whose forecast provides the boundary conditions. For this reason, regional models are used only for short-range forecasts.
A latitude-longitude model with a typical resolution of 1 degree and 20 vertical levels would have 360x180x20 = 1.3 M grid points. Each grid point has to carry the values of at least 4 prognostic variables (u, v, T, RH), plus surface pressure for each column.
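The grid-size arithmetic above can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the total assumes exactly 4 variables per grid point plus one surface pressure value per column, as stated in the notes.

```python
# Dimensions of a 1-degree latitude-longitude grid with 20 vertical levels.
N_LON, N_LAT, N_LEVELS = 360, 180, 20
N_PROGNOSTIC = 4  # u, v, T, RH carried at every grid point


def grid_points(n_lon=N_LON, n_lat=N_LAT, n_levels=N_LEVELS):
    """Number of three-dimensional grid points."""
    return n_lon * n_lat * n_levels


def total_unknowns(n_lon=N_LON, n_lat=N_LAT, n_levels=N_LEVELS,
                   n_prognostic=N_PROGNOSTIC):
    """Prognostic values in 3D plus one surface pressure per column."""
    columns = n_lon * n_lat
    return n_prognostic * grid_points(n_lon, n_lat, n_levels) + columns


print(grid_points())     # 1296000, i.e. about 1.3 M grid points
print(total_unknowns())  # 5248800, i.e. a little over 5 M values
```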
It is necessary to use additional information (background or first guess) to prepare initial conditions. The model forecast is interpolated to the observation location and, if they are different, converted from model variables to observed variables y^o.
The analysis x^a is obtained by adding the correction:
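The formula itself did not make it into these notes. In the usual textbook notation it takes the form below; the symbols x^b (background), H (observation operator) and W (weights) are not defined anywhere above, so treat them as assumed notation rather than as a quotation from the book.

```latex
x^a = x^b + W \left[ y^o - H(x^b) \right]
```

The bracket is the observational increment: the difference between the observation and the first guess mapped to the observation location and variable. The weights W depend on the estimated error statistics of the background and the observations.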
Threat score TS = (P & O) / (P | O), where P is the area where the event is predicted and O the area where it is observed. It is also known as the critical success index (CSI), and is a particularly useful score for quantities that are relatively rare.
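As a small illustration of the threat score, the sketch below treats the predicted and observed areas as sets of grid-cell identifiers. The function name and the toy cell numbers are invented for the example.

```python
def threat_score(predicted, observed):
    """Threat score (CSI): size of the overlap divided by size of the union.

    predicted, observed: iterables of grid-cell ids where the event is
    forecast or observed, respectively.
    """
    p, o = set(predicted), set(observed)
    union = p | o
    if not union:
        # Event neither forecast nor observed anywhere: score is undefined.
        return float("nan")
    return len(p & o) / len(union)


# Toy case: 4 cells forecast, 6 cells observed, 2 cells in common.
forecast_cells = {1, 2, 3, 4}
observed_cells = {3, 4, 5, 6, 7, 8}
print(threat_score(forecast_cells, observed_cells))  # 0.25
```

Correct forecasts of "no event" do not enter the score at all, which is why it stays informative for rare events.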
The forecasters also have access to several forecasts, and they use their judgment in assessing which one is more accurate in each case. This constitutes a major source of the “value-added” by the human forecasters.
The human forecasts are on the average significantly more skillful than the numerical guidance, but it is the improvement in NWP forecasts that drives the improvements in the subjective forecasts.
Since 1994, NCEP has been running 17 global forecasts per day, each out to 16 days, with initial perturbations obtained using the method of breeding growing dynamical perturbations in the atmosphere, which are also present in the analysis errors. The ECMWF ensemble contains 50 members.
Ensemble forecasting has 2 goals:
  1. to improve the forecast by ensemble averaging, since the components of the forecast that are most uncertain tend to be averaged out
  2. to provide forecasters with an estimate of the reliability of the forecast.
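Both goals can be illustrated with a toy ensemble: the ensemble mean serves the first goal, and the spread among members is one simple proxy for the second. This is a sketch with made-up numbers; operational centers use more elaborate reliability measures.

```python
from statistics import mean, pstdev


def ensemble_summary(members):
    """Return (ensemble mean, ensemble spread) at each forecast point.

    members: list of member forecasts, each a list of values valid at the
    same points. Spread is the population standard deviation across members.
    """
    per_point = list(zip(*members))  # regroup values point by point
    ens_mean = [mean(values) for values in per_point]
    spread = [pstdev(values) for values in per_point]
    return ens_mean, spread


# Three hypothetical members forecasting temperature at two locations.
members = [
    [10.0, 20.0],
    [12.0, 20.0],
    [14.0, 20.0],
]
ens_mean, spread = ensemble_summary(members)
print(ens_mean)  # [12.0, 20.0]
print(spread)    # first point uncertain (about 1.63), second point 0.0
```

Where the members agree the spread is small and the forecast can be trusted more; where they diverge, the mean smooths out the uncertain detail.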
Slowly varying surface forcing, especially from the tropical ocean and from land-surface anomalies, can produce atmospheric anomalies that are longer lasting and more predictable than individual weather patterns. A most notable example is ENSO, produced by unstable oscillations of the coupled ocean-atmosphere system, which recurs every 3-7 years. Because of their long time scale, the ENSO oscillations should be predictable a year or more in advance.

Governing equations

V. Bjerknes was a professor of applied mechanics and mathematical physics at the University of Stockholm. He elucidated the fundamental interaction between fluid dynamics and thermodynamics. In 1904, he set out the primitive equations that are used in climate models. Basically, there are 7 equations with 7 unknown variables:
  1. velocity vector (u, v, w)
  2. Temperature T
  3. Pressure p
  4. Density \rho, from the equation of state: p = \rho RT
  5. water vapor mixing ratio q: \frac{dq}{dt}=E-C
They can be grouped into 3 sets of equations:
  1. conservation of mass (continuity equation): \frac{\partial \rho}{\partial t} = -\nabla \cdot (\rho v)
  2. conservation of momentum (Newton’s 2nd law): \frac{dv}{dt}=F/m, which must account for the rotating frame of reference, pressure gradient force, gravitational acceleration, frictional force, Coriolis force and centrifugal force
  3. conservation of energy (thermodynamic energy equation)
Interestingly, his son was also a meteorologist, who helped pick the best date to drop the atomic bomb on Japan in 1945.

spherical coordinates

3 velocity components:
  • Zonal: along a latitudinal circle, west-east direction, u
  • Meridional: along longitudinal lines, south-north direction, v
  • Vertical: positive up, w
Basic wave oscillations:
  • sound
  • gravity
  • slower weather waves
They have profound implications for the present use of hydrostatic and nonhydrostatic models. Different approximations (hydrostatic, quasi-geostrophic, and anelastic) are designed to filter out some of them.
Assuming the solutions have plane-wave form, the specific type of wave can be determined by deriving the frequency dispersion relationship (FDR), and from it the frequency, phase speed and group velocity.
  • pure sound waves, speed = c_s = 320 m/s, propagating in any direction.
  • Lamb waves (horizontally propagating sound waves)
  • vertical gravitational oscillations
  • inertia oscillations (due to basic rotation)
  • Lamb waves in the presence of rotation and geostrophic modes. There will be 2 solutions: inertia Lamb waves and Rossby waves (the Coriolis force changes with latitude)
General wave solution of the perturbation equations in a resting, isothermal atmosphere.
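The quoted sound speed of about 320 m/s can be reproduced from the standard relation c_s^2 = \gamma R T for an ideal gas. The values below are assumptions for illustration, not taken from the notes: \gamma = 1.4 and R = 287 J/(kg K) for dry air, and a hypothetical isothermal temperature of 255 K.

```python
import math

GAMMA = 1.4    # ratio of specific heats c_p / c_v for dry air (assumed)
R_DRY = 287.0  # gas constant for dry air, J kg^-1 K^-1 (assumed)


def sound_speed(temperature_k, gamma=GAMMA, gas_constant=R_DRY):
    """Speed of sound in an ideal gas: c_s = sqrt(gamma * R * T)."""
    return math.sqrt(gamma * gas_constant * temperature_k)


# Hypothetical isothermal atmosphere at 255 K.
print(round(sound_speed(255.0), 1))  # roughly 320 m/s
```

Sound waves are therefore much faster than the weather-carrying waves, which is why filtering them out allows a much longer time step.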

Filtering approximations

  • By neglecting the time derivative in one of the equations of motion, we convert it from a prognostic equation into a diagnostic equation
  • Physically, we eliminate a restoring force that supports a certain type of wave
  • Most global models and some regional models use the hydrostatic approximation, which filters sound waves.

3. numerical discretization of the equations of motion

classification of partial differential equations (PDEs):
  • wave equation (hyperbolic)
  • diffusion equation (parabolic)
  • Laplace’s or Poisson’s equations (elliptic)
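The three classes above can be told apart with the usual discriminant test for a second-order linear PDE written as a u_xx + 2b u_xy + c u_yy + (lower-order terms) = 0. The sketch below assumes that form; the function name is invented, and for the wave and diffusion equations the second independent variable y plays the role of time.

```python
def classify_pde(a, b, c):
    """Classify a*u_xx + 2*b*u_xy + c*u_yy + ... = 0 by the sign of b^2 - a*c."""
    discriminant = b * b - a * c
    if discriminant > 0:
        return "hyperbolic"
    if discriminant == 0:
        return "parabolic"
    return "elliptic"


wave_speed = 2.0
# Wave equation u_tt - wave_speed^2 * u_xx = 0 (y is time).
print(classify_pde(a=-wave_speed**2, b=0.0, c=1.0))  # hyperbolic
# Diffusion equation u_t = sigma * u_xx: no u_tt term, so c = 0.
print(classify_pde(a=1.0, b=0.0, c=0.0))             # parabolic
# Laplace equation u_xx + u_yy = 0.
print(classify_pde(a=1.0, b=0.0, c=1.0))             # elliptic
```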
well-posedness, initial and boundary conditions
  • A well-posed initial/boundary condition problem has a unique solution that depends continuously on the initial/boundary conditions
  • If too many initial/boundary conditions are specified, there will be no solution.
  • If too few are specified, the solution will not be unique.
  • If the number of initial/boundary conditions is right, but they are specified at the wrong place or time, the solution will be unique, but it will not depend smoothly on the initial/boundary conditions, i.e., small errors in the initial/boundary conditions will produce huge errors in the solution.
  • We can never find a numerical solution of a problem that is ill posed: the computer will show its disgust by blowing up.
One method of solving simple PDEs is the method of separation of variables, but unfortunately in most cases it is not possible to use it, hence the need for numerical models.

3.3.2 Galerkin and spectral space representation

Spatial finite differences introduce errors in the space derivatives, resulting in a computational phase speed slower than the true phase speed, especially for short waves.
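A standard way to quantify this slowdown is to take the linear advection equation with second-order centered differences in space (and exact time integration). A wave with wavenumber k then travels at c' = c sin(k \Delta x)/(k \Delta x). The sketch below assumes that particular scheme and expresses the ratio in terms of the number of grid points per wavelength.

```python
import math


def phase_speed_ratio(points_per_wavelength):
    """Ratio c'/c for second-order centered differences in space.

    For a wave of length L = n * dx, k * dx = 2 * pi / n and
    c'/c = sin(k * dx) / (k * dx).
    """
    k_dx = 2.0 * math.pi / points_per_wavelength
    return math.sin(k_dx) / k_dx


for n in (2, 4, 10, 20):
    print(n, round(phase_speed_ratio(n), 3))
# The 2*dx wave does not move at all, the 4*dx wave moves at about 64%
# of the true speed, and well-resolved waves approach the true speed.
```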
The Galerkin approach represents the solution as a sum of basis functions. The basis functions are usually the eigensolutions of the Laplace equation. For spherical coordinates, the spherical harmonics are used.
The spatial resolution is uniform throughout the sphere. This is a major advantage over finite differences based on a latitude-longitude grid, where the convergence of the meridians at the poles requires very small time steps.

4. Introduction to the parameterization of subgrid scale physical processes

Despite the continued increase in resolution, many important processes and scales of motion in the atmosphere cannot be explicitly resolved with present or future models. They include turbulent motions (from 0.01 m up to the size of a model grid box) and molecular-scale processes (condensation, evaporation, friction and radiation).
These processes are called “sub grid-scale processes”.
To reproduce the interaction between the grid-scale and sub grid-scale processes, the sub grid-scale phenomena are parameterized, i.e., their effect is formulated in terms of the resolved fields.

5. data assimilation

Currently, operational NWP centers produce initial conditions through a statistical combination of observations and short-range forecasts.
Spatial interpolation of observations is not enough:
  • not enough data are available to initialize current models. Number of degrees of freedom in a modern NWP model is of the order of 10^7, but the total number of conventional observations of the variables used in the models is of the order of 10^4.
  • remote sensing data such as satellite and radar observations do not directly measure the model variables (wind, temperature, moisture, and surface pressure)
  • data distribution in space and time is very nonuniform. North America and Eurasia are relatively data-rich, others are much more poorly observed.
  • It is therefore necessary to have a complete first-guess estimate of the state of the atmosphere at all the grid points in order to generate the initial conditions. The first guess should be our best estimate of the state of the atmosphere prior to the use of the observations.
  • Initially, climatology, or a combination of climatology and a short forecast, was used as a first guess.
  • As forecasts became better, the use of a short-range forecast as a first guess was universally adopted in operational systems in what is called an “analysis cycle”.
3 statistical interpolation methods (3D-Var, OI (optimal interpolation), and PSAS) have been shown to formally solve the same problem. In practice, OI requires the introduction of a number of approximations, and local solution of the analysis, grid point by grid point, or small volume by small volume.
  • optimal analysis: minimize the analysis error variance, finding the optimal weights through a least-squares approach
  • variational approach: find the analysis that minimizes a cost function measuring its distance to the background and to the observations.
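The equivalence of the two approaches is easiest to see in the scalar case: one background value and one observation of the same variable, each with a known error variance. The sketch below is a toy under those assumptions (the function names and numbers are invented); the optimal-weight formula and the quadratic cost function are the textbook scalar forms.

```python
def optimal_analysis(x_b, y_o, var_b, var_o):
    """Least-squares analysis of a scalar from a background and an observation.

    Returns (analysis, analysis error variance, weight given to the observation).
    """
    weight = var_b / (var_b + var_o)
    x_a = x_b + weight * (y_o - x_b)
    var_a = (1.0 - weight) * var_b
    return x_a, var_a, weight


def cost(x, x_b, y_o, var_b, var_o):
    """Variational cost: squared distances to background and observation,
    each scaled by its error variance."""
    return 0.5 * ((x - x_b) ** 2 / var_b + (y_o - x) ** 2 / var_o)


# Hypothetical temperatures: background 20, observation 22, equal variances.
x_a, var_a, w = optimal_analysis(20.0, 22.0, var_b=1.0, var_o=1.0)
print(x_a, var_a, w)  # 21.0 0.5 0.5

# The least-squares analysis is also the minimum of the cost function.
args = (20.0, 22.0, 1.0, 1.0)
print(cost(x_a, *args) < cost(x_a + 0.1, *args))  # True
print(cost(x_a, *args) < cost(x_a - 0.1, *args))  # True
```

Note that the analysis error variance is smaller than either input variance: combining the two sources of information always helps when the statistics are right.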
Ensemble Kalman filtering: All the cycles assimilate the same real observations, but in order to keep them realistically independent, different sets of random perturbations are added to the observations assimilated in each member of the ensemble of data assimilations.
4D-Var. The cost function includes a term measuring the distance to the background at the beginning of the interval, and a summation over time of the cost function for each observational increment computed with respect to the model integrated to the observation time. … 4D-Var seeks an initial condition such that the forecast best fits the observations within the assimilation interval. However, the fact that the 4D-Var method assumes a perfect model is a disadvantage since, for example, it will give the same credence to older observations at the beginning of the interval as to newer observations at the end of the interval.
Quality control is based on a comparison between observations and some kind of expected value (from climatology, an average of nearby observations, or the first guess).
Collins (1998): Most common human errors have a simple structure: a single digit or a sign is wrong or missing.
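A minimal sketch of such a check compares each observation with its expected value and flags it when the difference exceeds a tolerance. The function names, the tolerance of 5 and the sample values are all hypothetical; real quality-control systems are considerably more elaborate.

```python
def passes_gross_check(observation, expected, tolerance):
    """True if the observation is within `tolerance` of the expected value."""
    return abs(observation - expected) <= tolerance


def flag_observations(observations, expected_values, tolerance):
    """Return one pass/fail flag per observation."""
    return [
        passes_gross_check(obs, exp, tolerance)
        for obs, exp in zip(observations, expected_values)
    ]


# Hypothetical temperatures (deg C); the second report has a wrong sign.
first_guess = [15.0, 12.0, -3.0]
reports = [15.4, -12.3, -2.1]
print(flag_observations(reports, first_guess, tolerance=5.0))
# [True, False, True]
```

A sign error like the one in the second report stands out clearly against the first guess, which is the kind of simple human error described above.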

6. Atmospheric predictability and ensemble forecasting

Lorenz (1993):
The initial round-off errors were the culprits; they were steadily amplifying until they dominated the solution. In today’s terminology, there was chaos. ... It soon struck me that, if the real atmosphere behaved like the simple model, long-range forecasting would be impossible.

The early history of NWP

The 1st real-time, operational NWP was run in Sweden in September 1954 (to 72h at 500 hPa), half a year before the USA.
Two reasons:
  1. In 1954, the Swedes had the world’s most powerful computer, BESK.
  2. Rossby moved to Sweden.
Interestingly, Rossby was seen as a troublemaker and was not elected as the director of the Swedish Meteorological Office. What an internal political conflict! Anyway, Rossby sought support from the Military Meteorological Service.

Tuesday, June 26, 2018

Fastest supercomputer

Briefly speaking,
  1. Today’s supercomputer is the personal computer of 20 years from now, and it is about 10,000 times faster. Roughly speaking, this means computing speed grows 100-fold every 10 years. But if you pile 10,000 cores together, does the speed surprise you? You still need to handle the interconnect and heat dissipation.
  2. Fast processor speed is not enough; it depends on your application. If your code and software, as well as the I/O in the rest of the ecosystem, cannot keep up, your processors just spin idle and waste electricity.
  3. The fastest computer as of 2018 runs at 200 petaflops (2e17 flops, IBM Summit). I guess applications are still lagging behind. Global weather prediction is still the largest source of demand. GFS and ECMWF now run on upgraded computers at about 8.5 petaflops, while my workplace, OU, has only 0.35 petaflops. So they are about 5 years behind the fastest computer.
  4. Do you really need a supercomputer? 99.99% of people only need a smartphone’s speed for YouTube videos or WeChat messages. So don’t worry about it. Focus on what you can do best and outsource the rest.
Question: why do processor chips keep getting cheaper while house prices keep getting more expensive?
fastest computer ranking:
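Satoshi Sekiguchi, director of Japan’s National Institute of Advanced Industrial Science and Technology (NIAIST), explains: “Current supercomputer systems are generally about 1 million times faster than a personal computer.” According to Sekiguchi, what a supercomputer processes in one day would take a personal computer 3,000 years to complete. In terms of size, this Japanese supercomputer will occupy 1,000 square meters, about the space of a parking lot for 30 to 40 cars.
Summit’s speed is astonishing. It can perform 200 quadrillion mathematical calculations per second (200 peta). Supercomputers like Summit, built with $200 million in government funding, can accelerate the development of frontier computing technologies such as artificial intelligence and the ability to process massive amounts of data.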
NOAA upgraded its supercomputers from 5.6 petaflops to 8.4 petaflops, or 4.2 petaflops per side (equivalent to 2011’s fastest).
The NWS has been using supercomputers for decades. The latest major update to the computers was in 2016. Currently, the combined processing power of NWS supercomputers is 5.78 petaflops, which is more than 10,000 times faster than the average desktop computer.
Cray XC40, 8.5 petaflops, 130 k cores
OU Schooner: 0.35 petaflops, 10 k cores
The computer firm houses Blue, NOAA’s primary forecasting computer system, at an IBM facility in Gaithersburg, Maryland. A backup system, White, is located at a NASA site in Fairmont, West Virginia.
The two machines rank as the world’s 69th and 70th most powerful computers, according to the TOP500 List of Supercomputers.
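Judging from Tianhe-2 today, the compute nodes consume about 18 MW, and with the cooling system the total is above 20 MW. In normal operation, Tianhe-2’s annual electricity bill would exceed 100 million yuan, with annual consumption of about 200 million kWh. The electricity costs 100,000 yuan per day.
“In fact, this requires future supercomputing systems to innovate in architecture, hardware, software, cooling and many other areas,” said Zhang Yunquan.
In the United States, by contrast, the builders of supercomputers are generally also their users. Titan, ranked third on the list and the fastest in the United States, was built by the US Department of Energy and is mainly used for nuclear test simulation within the department.
Chi Xuebin, director of the Supercomputing Center of the Chinese Academy of Sciences, says frankly: “Without development and use, a supercomputer is just a pile of scrap metal. Having only a high-performance machine, without people to provide high-level service, amounts to the same thing; the machine is obsolete after 5 years.”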
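Back when China’s GDP was small, they said GDP was the most authoritative measure of a country’s strength. Then China’s GDP rose, and they said industrial manufacturing was the real measure. When China’s manufacturing rose, they talked about industrial pollution and said national strength depends on per-capita figures. When the per-capita figures rose, they said developed countries all live a pollution-free, back-to-nature life, and that the more backward a region is, the happier its people are, for example Bhutan, that vassal state you cannot even find on a map, whose people can barely feed themselves and have no dignity.
1. Whatever China does not have is cutting-edge technology and a marker that separates races, character, and the advanced from the backward; whatever China does have is low-tech junk, all stolen, that anyone could build.
2. Any ranking in which China places low is internationally influential, authoritative, and a standard that separates races, character, and the advanced from the backward; any ranking in which China places high is meaningless and beneath the notice of developed countries, who prefer things small and slow and leave it to the conscience of ordinary people.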

Seymour Cray

father of supercomputer
Cray did not enjoy working on such ‘mundane’ machines, constrained to design for low-cost construction, so CDC could sell lots of them. His desire was to “produce the largest [fastest] computer in the world”.
Unlike most high-end projects, Cray realized that there was considerably more to performance than simple processor speed, that I/O bandwidth had to be maximized as well in order to avoid “starving” the processor of data to crunch. He later noted, “Anyone can build a fast CPU. The trick is to build a fast system.”
During this period Cray had become increasingly annoyed at what he saw as interference from CDC management. Cray always demanded an absolutely quiet work environment with a minimum of management overhead, but as the company grew he found himself constantly interrupted by middle managers who — according to Cray — did little but gawk and use him as a sales tool by introducing him to prospective customers.
Cray decided that in order to continue development he would have to move from St. Paul, far enough that it would be too long a drive for a “quick visit” and long distance telephone charges would be just enough to deter most calls, yet close enough that real visits or board meetings could be attended without too much difficulty. After some debate, Norris backed him and set up a new laboratory on land Cray owned in his hometown of Chippewa Falls.
Cray avoided publicity, and there are a number of unusual tales about his life away from work (termed “Rollwagenisms”, from then-CEO of Cray Research, John A. Rollwagen). He enjoyed skiing, windsurfing, tennis, and other sports. Another favorite pastime was digging a tunnel under his home; he attributed the secret of his success to “visits by elves” while he worked in the tunnel: “While I’m digging in the tunnel, the elves will often come to me with solutions to my problem.”


Pat Gelsinger, head of the Digital Enterprise Division at Intel, says that Moore’s Law will continue to apply for the next few years. In his keynote address at the Intel Developer Forum in Shanghai, he said that the performance of supercomputers would be measured in zettaflops (10^21 floating-point operations per second) by around 2029. With that power, he said it would be possible to make weather forecasts that would be sufficiently accurate for 14 days.
One petaflop (10^15 floating-point operations per second) would allow real-time analysis of images taken by magnetic resonance scanners. Current systems need around two hours for such analyses.