Wednesday, June 27, 2018

book, Atmospheric modeling, data assimilation and predictability

Atmospheric modeling, data assimilation and predictability
by Eugenia Kalnay in 2003
Because of their higher resolution, regional models have the advantage of higher accuracy and the ability to reproduce smaller-scale phenomena such as fronts, squall lines, and orographic forcing much better than global models. On the other hand, regional models have the disadvantage that they are not “self-contained”: they require lateral boundary conditions at the borders of the horizontal domain. These boundary conditions must be as accurate as possible, because otherwise the interior solution of the regional model quickly deteriorates. Therefore it is customary to nest the regional model within another model with coarser resolution, whose forecast provides the boundary conditions. For this reason, regional models are used only for short-range forecasts.
A latitude-longitude model with a typical resolution of 1 degree and 20 vertical levels would have 360 x 180 x 20 ≈ 1.3 million grid points. Each grid point has to carry the values of at least 4 prognostic variables (u, v, T, RH), plus surface pressure for each column.
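A rough back-of-the-envelope calculation of the model size (a minimal Python sketch; the counts follow directly from the numbers above):

```python
# Back-of-the-envelope size of a 1-degree, 20-level latitude-longitude model.
nlon, nlat, nlev = 360, 180, 20            # 1-degree grid, 20 vertical levels
grid_points = nlon * nlat * nlev           # ~1.3 million 3-D grid points
prognostic_3d = 4                          # u, v, T, RH at every grid point
values_3d = grid_points * prognostic_3d
values_2d = nlon * nlat                    # surface pressure, one per column
print(f"grid points: {grid_points:,}")                   # 1,296,000
print(f"prognostic values: {values_3d + values_2d:,}")   # ~5.2 million
```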
It is necessary to use additional information (background or first guess) to prepare the initial conditions. The model forecast is interpolated to the observation locations and, if the observed quantities are not model variables, converted from model variables to the observed variable types y^o.
The analysis x^a is obtained by adding the correction:
x^a = x^b + W[y^o - H(x^b)]
where W is a weight matrix, H is the observation operator, and y^o - H(x^b) is the observational increment.
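A minimal numerical sketch of this update, assuming a tiny 3-point state, 2 observations, a linear observation operator H, and a made-up weight matrix W (in OI/3D-Var, W would be derived from the background and observation error covariances):

```python
import numpy as np

# Analysis update x_a = x_b + W [y_o - H(x_b)] on a toy 3-point state.
x_b = np.array([280.0, 282.0, 285.0])   # background (first guess), e.g. temperature
H   = np.array([[1.0, 0.0, 0.0],        # observation operator: observes points 1 and 3
                [0.0, 0.0, 1.0]])
y_o = np.array([279.0, 286.0])          # observations
W   = np.array([[0.6, 0.0],             # weight matrix, assumed given here
                [0.3, 0.3],
                [0.0, 0.6]])
x_a = x_b + W @ (y_o - H @ x_b)         # analysis
print(x_a)                              # corrections spread to nearby points
```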
Threat score TS = (P ∩ O) / (P ∪ O), where P and O are the predicted and observed event areas. It is also known as the critical success index (CSI) and is particularly useful for quantities that are relatively rare.
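In terms of event counts (hits, misses, false alarms) the score is hits / (hits + misses + false alarms); a small sketch with made-up numbers:

```python
def threat_score(hits, misses, false_alarms):
    """Threat score / CSI: correct forecasts of the event divided by all cases
    where the event was forecast and/or observed (correct negatives excluded)."""
    return hits / (hits + misses + false_alarms)

# Example: 30 correctly forecast rain events, 10 missed, 20 false alarms -> TS = 0.5
print(threat_score(30, 10, 20))
```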
The forecasters also have access to several forecasts, and they use their judgment in assessing which one is more accurate in each case. This constitutes a major source of the “value-added” by the human forecasters.
The human forecasts are on the average significantly more skillful than the numerical guidance, but it is the improvement in NWP forecasts that drives the improvements in the subjective forecasts.
Since 1994, NCEP has been running 17 global forecasts per day, each out to 16 days, with initial perturbations obtained using the method of breeding growing dynamical perturbations in the atmosphere, which are also present in the analysis errors. The ECMWF ensemble contains 50 members.
Ensemble forecasting has 2 goals:
  1. the components of the forecast that are most uncertain tend to be averaged out in the ensemble mean
  2. provide forecasters with an estimate of the reliability of the forecast (see the sketch below)
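A toy illustration of both goals with made-up numbers (the ensemble mean damps the member-dependent part of the forecast, while the spread measures confidence):

```python
import numpy as np

# Three hypothetical ensemble members forecasting some field at 3 points.
members = np.array([
    [5.0, 12.0, 3.0],   # member 1
    [5.2, 15.0, 2.5],   # member 2
    [4.8,  9.0, 3.4],   # member 3
])
ens_mean   = members.mean(axis=0)   # uncertain components tend to average out
ens_spread = members.std(axis=0)    # large spread -> low confidence at that point
print(ens_mean)    # [5.0, 12.0, 2.97]
print(ens_spread)  # [0.16, 2.45, 0.37] -> the second point is the least reliable
```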
Slowly varying surface forcing, especially from the tropical ocean and from land-surface anomalies, can produce atmospheric anomalies that are longer lasting and more predictable than individual weather patterns. The most notable example is ENSO, produced by unstable oscillations of the coupled ocean-atmosphere system with a period of 3-7 years. Because of their long time scale, the ENSO oscillations should be predictable a year or more in advance.

Governing equations

V. Bjerknes was a professor of applied mechanics and mathematical physics at the University of Stockholm. He elucidated the fundamental interaction between fluid dynamics and thermodynamics. In 1904, he pointed out the primitive equations that are used in climate models. Basically, it is a system of 7 equations in 7 unknown variables:
  1. velocity vector (u, v, w)
  2. Temperature T
  3. pressure P
  4. Density rho: p= \rho RT
  5. water vapor mixing ratio q: \frac{dq}{dt}=E-C
They can be grouped into 3 sets of equations:
  1. conservation of mass (continuity equation): \frac{\partial \rho}{\partial t} = -\nabla \cdot (\rho \mathbf{v})
  2. conservation of momentum (Newton's 2nd law): \frac{d\mathbf{v}}{dt} = \mathbf{F}/m; in the rotating frame of reference one must consider the pressure gradient force, gravitational acceleration, frictional force, Coriolis force and centrifugal force
  3. conservation of energy (thermodynamic energy equation)
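For reference, the seven equations in a compact standard form (my own summary in conventional textbook notation, not copied verbatim from the book):

```latex
\begin{align}
  \frac{\partial \rho}{\partial t} + \nabla\!\cdot(\rho\,\mathbf{v}) &= 0
    && \text{continuity (conservation of mass)} \\
  \frac{d\mathbf{v}}{dt} &= -\frac{1}{\rho}\nabla p
      - 2\,\boldsymbol{\Omega}\times\mathbf{v} + \mathbf{g} + \mathbf{F}
    && \text{momentum (3 components, Newton's 2nd law)} \\
  c_p\,\frac{dT}{dt} - \frac{1}{\rho}\frac{dp}{dt} &= Q
    && \text{thermodynamic energy} \\
  p &= \rho R T && \text{equation of state} \\
  \frac{dq}{dt} &= E - C && \text{water vapor mixing ratio}
\end{align}
```
Counting the three momentum components separately gives 7 equations for the 7 unknowns u, v, w, T, p, \rho, q.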
Interestingly, his son was also a meteorologist, who helped pick the best date for dropping the atomic bomb on Japan in 1945.

spherical coordinates

3 velocity components:
  • Zonal: along a latitudinal circle, west-east direction, u
  • Meridional: along longitudinal lines, south-north direction, v
  • Vertical: positive up, w
Basic wave oscillations:
  • sound
  • gravity
  • slower weather waves (Rossby waves)
They have profound implications for the present use of hydrostatic and nonhydrostatic models. Different approximations (hydrostatic, quasi-geostrophic, and anelastic) are designed to filter some of them out.
Assuming the solutions have a plane-wave form, the specific type of wave can be determined by deriving the frequency dispersion relationship (FDR), from which the frequency, phase speed, and group velocity follow (see the sketch after the list below).
  • pure sound waves, speed = c_s = 320 m/s, propagating in any direction.
  • Lamb waves (horizontally propagating sound waves)
  • vertical gravitational oscillations
  • inertia oscillations (due to basic rotation)
  • Lamb waves in the presence of rotation and geostrophic modes. There will be 2 solutions: inertia Lamb waves and Rossby waves (because the Coriolis force changes with latitude)
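A small sketch of the procedure, using the standard shallow-water inertia-gravity dispersion relation \omega^2 = f^2 + gH k^2 as an assumed example (the book derives the full set from the perturbation equations):

```python
import numpy as np

# Plane-wave solutions exp(i(kx - omega*t)) inserted into the linearized equations
# give a dispersion relation; here the shallow-water inertia-gravity case.
g, H = 9.81, 10_000.0                            # gravity, equivalent depth (m)
f    = 1.0e-4                                    # mid-latitude Coriolis parameter (s^-1)
k    = 2 * np.pi / np.array([1e5, 1e6, 1e7])     # wavenumbers: 100, 1000, 10000 km waves

omega   = np.sqrt(f**2 + g * H * k**2)           # frequency
c_phase = omega / k                              # phase speed
c_group = g * H * k / omega                      # group velocity d(omega)/dk
for L, cp, cg in zip((100, 1000, 10000), c_phase, c_group):
    print(f"{L:>6} km wave: c_phase = {cp:6.1f} m/s, c_group = {cg:6.1f} m/s")
```
For short waves both speeds approach sqrt(gH) (pure gravity waves); for very long waves rotation dominates and the group velocity drops, which is the inertia-oscillation limit.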
General wave solution of the perturbation equations in a resting, isothermal atmosphere.

Filtering approximations

  • By neglecting the time derivative in one of the equations of motion, we convert it from a prognostic equation into a diagnostic equation
  • Physically, we eliminate a restoring force that supports a certain type of wave
  • Most global models and some regional models use the hydrostatic approximation, which filters out sound waves.
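A concrete example of the idea, written out for the hydrostatic case (standard form, not quoted from the book):

```latex
\begin{align}
  \frac{dw}{dt} &= -\frac{1}{\rho}\frac{\partial p}{\partial z} - g
    && \text{nonhydrostatic: prognostic in } w \\
  0 &= -\frac{1}{\rho}\frac{\partial p}{\partial z} - g
    \;\;\Longrightarrow\;\;
    \frac{\partial p}{\partial z} = -\rho g
    && \text{hydrostatic: diagnostic, vertical acoustic oscillations removed}
\end{align}
```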

3. Numerical discretization of the equations of motion

classification of partial differential equations (PDEs):
  • wave equation (hyperbolic)
  • diffusion equation (parabolic)
  • Laplace’s or Poisson’s equations (elliptic)
well-posedness, initial and boundary conditions
  • a well-posed initial/boundary value problem has a unique solution that depends continuously on the initial/boundary conditions
  • If too many initial/boundary conditions are specified, there will be no solution.
  • If too few are specified, the solution will not be unique.
  • If the number of initial/boundary conditions is right, but they are specified at the wrong place or time, the solution will be unique, but it will not depend smoothly on the initial/boundary conditions, i.e., small errors in the initial/boundary conditions will produce huge errors in the solution.
  • We can never find a numerical solution of a problem that is ill posed: the computer will show its disgust by blowing up.
One method of solving simple PDEs is separation of variables, but unfortunately in most cases it is not possible to use it, hence the need for numerical models.

3.3.2 Galerkin and spectral space representation

p94
Spatial finite differences introduce errors in the space derivatives, resulting in a computational phase speed slower than the true phase speed, especially for short waves.
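For the 1-D advection equation u_t + c u_x = 0, the second-order centered difference gives a computational phase speed c* = c sin(k\Delta x)/(k\Delta x); a quick sketch (my own illustration, not from the book) of how bad this gets for short waves:

```python
import numpy as np

c, dx = 10.0, 100_000.0                      # true advection speed (m/s), grid spacing (m)
wavelengths = np.array([2, 4, 8, 16]) * dx   # wavelengths in multiples of dx
k = 2 * np.pi / wavelengths
c_num = c * np.sin(k * dx) / (k * dx)        # computational phase speed (centered scheme)
for n, cn in zip((2, 4, 8, 16), c_num):
    print(f"{n:>2} dx wave: c* = {cn:5.2f} m/s (true c = {c} m/s)")
# The 2*dx wave does not move at all (c* = 0); long waves are nearly correct.
```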
The Galerkin approach uses a sum of basis functions. The basis functions are usually the eigensolutions of the Laplace equation. For spherical coordinates, the spherical harmonics are used.
The spatial resolution is uniform throughout the sphere. This is a major advantage over finite differences based on a latitude-longitude grid, where the convergence of the meridians at the poles requires very small time steps.
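A minimal illustration of why spectral (Galerkin) derivatives are attractive, using a Fourier basis on a periodic 1-D domain as a stand-in for spherical harmonics on the sphere (my own sketch):

```python
import numpy as np

n  = 64
x  = np.linspace(0, 2 * np.pi, n, endpoint=False)
u  = np.sin(3 * x)                                # a single resolved wave
k  = np.fft.fftfreq(n, d=1.0 / n)                 # integer wavenumbers on [0, 2*pi)
du_spectral = np.real(np.fft.ifft(1j * k * np.fft.fft(u)))   # exact for resolved waves
du_centered = (np.roll(u, -1) - np.roll(u, 1)) / (2 * (x[1] - x[0]))
exact = 3 * np.cos(3 * x)
print(np.max(np.abs(du_spectral - exact)))   # ~1e-13 (machine precision)
print(np.max(np.abs(du_centered - exact)))   # ~4e-2 (finite-difference phase error)
```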

4. Introduction to the parameterization of subgrid scale physical processes

Despite the continued increase in resolution, many important processes and scales of motion in the atmosphere cannot be explicitly resolved with present or future models. They include turbulent motions (from about 1 cm up to the model grid size) and processes that occur at the molecular scale (condensation, evaporation, friction and radiation).
These processes are called “subgrid-scale processes”.
To reproduce the interaction of the grid-scale and subgrid-scale processes, the subgrid-scale phenomena are parameterized, i.e., their effect is formulated in terms of the resolved fields.

5. Data assimilation

Currently, operational NWP centers produce initial conditions through a statistical combination of observations and short-range forecasts.
Spatial interpolation of observations is not enough:
  • not enough data are available to initialize current models. Number of degrees of freedom in a modern NWP model is of the order of 10^7, but the total number of conventional observations of the variables used in the models is of the order of 10^4.
  • remote sensing data such as satellite and radar observations do not directly measure the model variables (wind, temperature, moisture, and surface pressure)
  • data distribution in space and time is very nonuniform. North America and Eurasia are relatively data-rich; other regions are much more poorly observed.
Solution:
  • have a complete first guess estimate of the state of the atmosphere at all the grid points in order to generate the initial conditions. The first guess should be our best estimate of the state of the atmosphere prior to the use of the observations.
  • Initially, climatology, or a combination of climatology and a short forecast, was used as the first guess.
  • As forecasts became better, the use of a short-range forecast as the first guess was universally adopted in operational systems, in what is called an “analysis cycle”.
Three statistical interpolation methods (3D-Var, OI (optimal interpolation), and PSAS) have been shown to formally solve the same problem. In practice, OI requires the introduction of a number of approximations and a local solution of the analysis, grid point by grid point or small volume by small volume.
  • optimal interpolation: minimize the analysis error variance, finding the optimal weights through a least-squares approach
  • variational approach (3D-Var): find the analysis that minimizes a cost function measuring its distance to the background and to the observations (a toy sketch follows below)
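A toy sketch of the variational formulation with a linear observation operator and made-up covariances (for linear H the minimum coincides with the OI solution):

```python
import numpy as np

# 3D-Var cost function:
#   J(x) = 1/2 (x - x_b)^T B^-1 (x - x_b) + 1/2 (y_o - H x)^T R^-1 (y_o - H x)
x_b = np.array([280.0, 282.0, 285.0])          # background
B   = np.diag([1.0, 1.0, 1.0])                 # background error covariance (toy)
H   = np.array([[1.0, 0.0, 0.0],
                [0.0, 0.0, 1.0]])              # observation operator
y_o = np.array([279.0, 286.0])                 # observations
R   = np.diag([0.5, 0.5])                      # observation error covariance (toy)

def cost(x):
    db = x - x_b
    do = y_o - H @ x
    return 0.5 * db @ np.linalg.solve(B, db) + 0.5 * do @ np.linalg.solve(R, do)

K   = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)   # optimal gain (closed form for linear H)
x_a = x_b + K @ (y_o - H @ x_b)
print(x_a, cost(x_a) < cost(x_b))                # the analysis lowers the cost function
```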
Ensemble Kalman filtering: all the cycles assimilate the same real observations, but in order to keep them realistically independent, different sets of random perturbations are added to the observations assimilated by each member of the ensemble of data assimilations (see the sketch below).
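A sketch of just the perturbed-observation step (illustrative only; the full EnKF update is not shown):

```python
import numpy as np

rng   = np.random.default_rng(0)
y_o   = np.array([279.0, 286.0])            # the same real observations for every member
r_std = np.array([0.7, 0.7])                # assumed observation error standard deviations
for m in range(5):                          # 5 ensemble members
    y_pert = y_o + rng.normal(0.0, r_std)   # observations "seen" by member m
    # ... assimilate y_pert into member m's background here ...
    print(m, y_pert)
```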
4D-Var: the cost function includes a term measuring the distance to the background at the beginning of the interval, and a summation over time of the cost function for each observational increment computed with respect to the model integrated to the observation time. … 4D-Var seeks an initial condition such that the forecast best fits the observations within the assimilation interval. However, the fact that the 4D-Var method assumes a perfect model is a disadvantage since, for example, it will give the same credence to older observations at the beginning of the interval as to newer observations at the end of the interval.
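Schematically, the 4D-Var cost function over an assimilation window t_0, ..., t_N looks like this (standard form, with M_{0\to i} denoting the model integration from t_0 to the observation time t_i):

```latex
\begin{equation}
  J(x_0) = \tfrac{1}{2}\,(x_0 - x_b)^{\mathrm T} B^{-1} (x_0 - x_b)
  + \tfrac{1}{2}\sum_{i=0}^{N}
      \bigl[y_i - H_i\!\bigl(M_{0\to i}(x_0)\bigr)\bigr]^{\mathrm T} R_i^{-1}
      \bigl[y_i - H_i\!\bigl(M_{0\to i}(x_0)\bigr)\bigr]
\end{equation}
```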
quality control is based on a comparison between observations and some kind of expected value (from climatology, an average of nearby observations, or the first guess).
Collins (1998): most common human errors have a simple structure: a single digit or a sign is wrong or missing.

6. Atmospheric predictability and ensemble forecasting

Lorenz (1993):
The initial round-off errors were the culprits; they were steadily amplifying until they dominated the solution. In today’s terminology, there was chaos. … It soon struck me that, if the real atmosphere behaved like the simple model, long-range forecasting would be impossible.

The early history of NWP

The 1st real-time, operational NWP was run in Sweden in September 1954 (to 72h at 500 hPa), half a year before the USA.
Two reasons:
  1. In 1954, the Swedes had the world’s most powerful computer, BESK.
  2. Rossby moved to Sweden.
Interestingly, Rossby was seen as a troublemaker and was not elected director of the Swedish Meteorological office. What an internal political conflict! Anyway, Rossby sought support from the Military Meteorological Service.

Tuesday, June 26, 2018

Fastest supercomputer

Briefly speaking,
  1. A supercomputer is roughly the personal computer of 20 years from now: it is about 10,000 times faster, which means computers generally get about 100 times faster every 10 years. But if you pile 10,000 cores together, does the speed still surprise you? You still need to handle interconnect and heat dissipation.
  2. A fast processor alone is not enough; it depends on your application. If your code and software, as well as the rest of the I/O ecosystem, do not keep up, your computer just spins idly and wastes electricity.
  3. The fastest computer as of 2018 runs at 200 petaflops (2e17 FLOPS; IBM Summit). I guess the applications still lag behind. Global weather prediction is still the largest demand. GFS and ECMWF now use upgraded computers at about 8.5 petaflops, and my workplace OU has only 0.35 petaflops, so they are roughly 5 years behind the fastest machine.
  4. Do you really need a supercomputer? 99.99% of people only need a smartphone’s speed for YouTube videos or WeChat communication. So don’t worry about it. Focus on what you can do best and outsource the rest.
Beijing: over the past 15 years housing prices have risen 13.2x, an annualized growth of 18.77%: (1 + 18.77%)^15 = 13.2.
Q: Why do processor chips keep getting cheaper while housing prices keep getting more expensive?
A: Because advances in semiconductor technology make transistors ever smaller and denser, while the human body stays basically the same size and ever more people pour into big cities. Housing prices are simply a barrier against this influx: they divert the flow by raising the cost of living, which is essentially a market mechanism. Everyone longs for life in a big city, but not everyone can afford the cost.
fastest computer ranking: https://www.top500.org/
Japan is developing the world’s fastest supercomputer to serve as a platform for artificial-intelligence research. A recent CNN report revealed that Japan expects to complete this supercomputer, named the “AI Bridging Cloud Infrastructure” (ABCI), before April 2018.
Satoshi Sekiguchi, director at Japan’s National Institute of Advanced Industrial Science and Technology (NIAIST), explained: “Current supercomputer systems generally compute about one million times faster than a personal computer.” According to Sekiguchi, what a supercomputer processes in one day would take a personal computer 3,000 years. In terms of size, Japan’s supercomputer will occupy 1,000 square meters, roughly the space of a parking lot holding 30 to 40 cars.
Sekiguchi said that building ABCI amounts to creating a research platform for Japanese manufacturers, helping them develop and improve self-driving cars, robots, medical diagnostic services, and more. “A supercomputer is an extremely important tool for advancing the field of artificial intelligence.”
China’s Sunway TaihuLight, meanwhile, is currently used in fields such as weather forecasting, pharmaceutical research, and industrial design.
2018.6
Summit’s speed is astonishing: it can perform 200 quadrillion (200 peta) mathematical calculations per second. Supercomputers like Summit, built with $200 million of government funding, can accelerate frontier computing technologies such as artificial intelligence and the processing of massive data.
Summit consists of rows of black, refrigerator-sized units weighing a total of 340 tons, housed in a 9,250-square-foot (about 859 square meter) room. Its computing power comes from 9,216 IBM central processing chips and 27,648 graphics processors from Nvidia, another US technology company, linked together by 185 miles (about 298 km) of fiber-optic cable.
Cooling Summit requires 4,000 gallons (about 15,000 liters) of water per minute, and the supercomputer consumes enough electricity to light 8,100 American homes.
2018
NOAA upgraded its supercomputers from 5.6 petaflops to 8.4 petaflops, or 4.2 petaflops per side (equivalent to the fastest computer of 2011).
The NWS has been using supercomputers for decades. The latest major update to the computers was in 2016. Currently, the combined processing power of NWS supercomputers is 5.78 petaflops, which is more than 10,000 times faster than the average desktop computer.
ECMWF
Cray XC40, 8.5 petaflops, 130k cores
OU Schooner: 0.35 petaflops, 10k cores
2005.8.29
The computer firm houses Blue, NOAA’s primary forecasting computer system, at an IBM facility in Gaithersburg, Maryland. A backup system, White, is located at a NASA site in Fairmont, West Virginia.
The two machines rank as the world’s 69th and 70th most powerful computers, according to the TOP500 List of Supercomputers.
2017.6
Besides supporting government research projects such as the lunar exploration program and human spaceflight, Tianhe-2 is gradually being applied in civilian areas such as oil exploration, automobile and aircraft design and manufacturing, and gene sequencing.
Tianhe-2 can also be used in the entertainment industry; making animation and 3D films with supercomputers has become a trend. Rendering the animation for the film Avatar took more than a year; with Tianhe-2, a film with rendering quality comparable to Avatar could be produced in just one month.
The Beijing research center of the Commercial Aircraft Corporation of China used about 24,000 CPU cores to carry out full-parameter aerodynamic optimization of a large passenger aircraft. Six days of computation on Tianhe-2 completed a workload that would have taken about two years on its own computing platform, greatly improving optimization efficiency.
The head of BGI’s internet support and development center said Tianhe-2 has enormous computing power: for a whole-genome association analysis of 500 people, BGI needed a year on its original computers but only three hours on Tianhe-2. BGI is a major commercial customer of Tianhe-1 and Tianhe-2.
For Tianhe-2, the compute nodes draw about 18 MW, and with the cooling system the total power exceeds 20 MW. In normal operation, Tianhe-2’s annual electricity bill exceeds 100 million yuan, with annual consumption of around 200 million kWh, i.e., about 100,000 yuan of electricity per day.
Titan and Sequoia have energy efficiencies of 1.95 and 2.17 petaflops per megawatt respectively, slightly better than Tianhe-2.
“In fact, this requires future supercomputing systems to innovate in architecture, hardware, software, cooling, and other areas,” said Zhang Yunquan.
Xi Zili, director of the Shanghai Supercomputer Center, said in 2012 that the center has 200 teraflops of computing capacity, but only 20-30% of its jobs can scale to 10 teraflops, while another 20-30% of the jobs use less than 2 teraflops. Many computing resources effectively go unused because of application issues, resulting in a certain amount of waste.
A look at the Linpack rankings shows that 41 Chinese supercomputers made the world’s top 500; 40 of them use Intel processors and one uses an AMD processor. Meanwhile, in neighboring Japan, Fujitsu (FUJITSU) is building a supercomputer that relies entirely on Japan’s own technology.
This era has seen many “grassroots” supercomputers, for example one pieced together from Shuttle HTPC barebones systems, or one assembled from all the Mac computers of a school. These seemingly toy-like machines at one point even made the TOP500 list, and Google’s own servers were put together in a similar way. In the process, people discovered that what actually limits a supercomputer is power consumption: you cannot stack up too many units because of power and heat constraints, and performance per watt matters even more than performance itself.
The real application bottleneck of large machines is the lag in peripheral hardware and software. The truly encouraging news, therefore, should not merely be the single metric of computing speed being “world number one”, but improvements in our software that widen the depth and range of application of existing machines. Yet, as I recall, since 1978 all the computer news we have heard has been about increases in computing speed. That is because software achievements are hard to explain to laymen (and leaders), whereas computing speed is a simple concept, and with international rankings, outsiders easily recognize the achievement.
It is like weapons: in peacetime, sounding impressive is merely a matter of face; being useful and sellable is the real benefit.
In the United States, the builders of supercomputers are generally also their users. Titan, ranked third on the list and the fastest in the US, was built by the Department of Energy and is mainly used for the DOE’s internal nuclear test simulations.
Different purposes naturally lead to different ways of building. Abroad, supercomputer builders generally start from a computing demand, design the system around the required workload, and architect the machine accordingly. In China, the machine is built first, pushing capability as high as possible, and then one tries to meet higher computing demands.
Chi Xuebin, director of the CAS Supercomputing Center, admitted: “Detached from development and use, a supercomputer is just a pile of scrap metal. Having a high-performance machine without people to provide high-level services amounts to the same thing; the machine will be obsolete in 5 years.”
When China’s GDP was small, they said GDP was the most authoritative measure of a country’s strength. Then China’s GDP rose, and they said industrial manufacturing was the real measure of strength. When China’s manufacturing rose, they talked about industrial pollution and said national strength depends on per-capita figures. When the per-capita figures rose, they said developed countries live pollution-free, back-to-nature lives, and that the more backward a region, the happier its people, citing, for example, Bhutan, a client state you can barely find on a map, where people can scarcely feed themselves and have little dignity.
Two rather interesting phenomena:
1. Whatever China does not have is cutting-edge high technology, a marker that separates races, quality, and the advanced from the backward; whatever China does have is low-tech garbage, all stolen, something anyone can make.
2. Any ranking in which China places low is internationally influential, authoritative, a standard that separates races, quality, and the advanced from the backward; any ranking in which China places high is uninteresting, something developed countries do not even care about, since they prefer things small and slow, relying on the good conscience of ordinary people.

Seymour Cray

father of supercomputer
Cray did not enjoy working on such ‘mundane’ machines, constrained to design for low-cost construction, so CDC could sell lots of them. His desire was to “produce the largest [fastest] computer in the world”.
Unlike most high-end projects, Cray realized that there was considerably more to performance than simple processor speed, that I/O bandwidth had to be maximized as well in order to avoid “starving” the processor of data to crunch. He later noted, “Anyone can build a fast CPU. The trick is to build a fast system.”
During this period Cray had become increasingly annoyed at what he saw as interference from CDC management. Cray always demanded an absolutely quiet work environment with a minimum of management overhead, but as the company grew he found himself constantly interrupted by middle managers who — according to Cray — did little but gawk and use him as a sales tool by introducing him to prospective customers.
Cray decided that in order to continue development he would have to move from St. Paul, far enough that it would be too long a drive for a “quick visit” and long distance telephone charges would be just enough to deter most calls, yet close enough that real visits or board meetings could be attended without too much difficulty. After some debate, Norris backed him and set up a new laboratory on land Cray owned in his hometown of Chippewa Falls.
Cray avoided publicity, and there are a number of unusual tales about his life away from work (termed “Rollwagenisms”, from then-CEO of Cray Research, John A. Rollwagen). He enjoyed skiing, windsurfing, tennis, and other sports. Another favorite pastime was digging a tunnel under his home; he attributed the secret of his success to “visits by elves” while he worked in the tunnel: “While I’m digging in the tunnel, the elves will often come to me with solutions to my problem.”

application

Pat Gelsinger, head of the Digital Enterprise Division at Intel, says that Moore’s Law will continue to apply for the next few years. In his keynote address at the Intel Developer Forum in Shanghai, he said that the performance of supercomputers would be measured in zettaflops (10^21 floating-point operations per second) by around 2029. With that power, he said it would be possible to make weather forecasts that would be sufficiently accurate for 14 days.
One petaflop (10^15 floating-point operations per second) would allow real-time analysis of images taken by magnetic resonance scanners. Current systems need around two hours for such analyses.