meteva

提供气象产品检验相关python程序


FSS

<p>[TOC]</p> <pre><code class="language-python">%matplotlib inline %load_ext autoreload %autoreload 2 import meteva.base as meb import meteva.product as mpd import meteva.method as mem import datetime import meteva.perspact as mps # 透视分析模块</code></pre> <pre><code>The autoreload extension is already loaded. To reload it, use: %reload_ext autoreload</code></pre> <pre><code class="language-python">grade_list = [5,10] half_window_size_list = [5,10,15,20,25] path_ob = r'H:\test_data\input\mem\mode\ob\rain03\20070111.000.nc' path_fo = r'H:\test_data\input\mem\mode\ec\rain03\20070108.003.nc' grd_ob1 = meb.read_griddata_from_nc(path_ob,time = &amp;quot;2020070111&amp;quot;,dtime = 0,level =0,data_name = &amp;quot;ob&amp;quot;) grd_fo1 = meb.read_griddata_from_nc(path_fo,time = &amp;quot;2020070108&amp;quot;,dtime = 3,level =0,data_name = &amp;quot;ECMWF&amp;quot;) meb.contourf_2d_grid(grd_ob1,cmap = &amp;quot;rain_3h&amp;quot;) meb.contourf_2d_grid(grd_fo1,cmap = &amp;quot;rain_3h&amp;quot;)</code></pre> <p><img src="https://www.showdoc.com.cn/server/api/attachment/visitfile/sign/0a1cf443fb8d1ef631ff4e94071dbee8" alt="" /></p> <p><img src="https://www.showdoc.com.cn/server/api/attachment/visitfile/sign/b200d7d18f8abc898d063d078283109e" alt="" /></p> <h1>统计fss中间结果</h1> <p>&lt;font face=&quot;黑体&quot; color=Blue size=3&gt;<strong>fbs_pobfo(ob_xy, fo_xy,grade_list=[1e-30],half_window_size_list=[1],compare = &quot;&gt;=&quot;, masker=None)</strong>&lt;/font&gt;<br /> 输入numpy形式的2维网格数据,计算fss评分相关的中间结果,计算步骤包括:</p> <ol> <li>设置1个阈值,将观测(预报)大于等于阈值的部分设置为1,其它部分设为0 </li> <li>对网格数据进行滑动平均,滑动窗口大小为half_window_size * 2 + 1,所得结果记为窗口概率</li> <li>将平滑后结果,乘以masker_xy </li> <li>计算上一步之后的窗口概率观测值的平方的均值,窗口概率预报值的平方的均值,窗口概率误差(预报-观测)的平方的均值 </li> <li>循环不同的阈值和窗口大小,重复上述步骤</li> </ol> <table> <thead> <tr> <th style="text-align: left;">参数</th> <th style="text-align: center;">说明</th> <th style="text-align: left;">备注</th> </tr> </thead> <tbody> <tr> <td style="text-align: left;"><strong>&lt;font face=&quot;黑体&quot; color=Blue size=5&gt;ob_xy&lt;/font&gt;</strong></td> <td style="text-align: center;">一个观测平面场数据</td> <td style="text-align: left;">numpy数组</td> </tr> <tr> <td style="text-align: left;"><strong>&lt;font face=&quot;黑体&quot; color=Blue size=5&gt;grd_fo&lt;/font&gt;</strong></td> <td style="text-align: center;">一个预报平面场数据</td> <td style="text-align: left;">numpy数组,shape和观测一致</td> </tr> <tr> <td style="text-align: left;"><strong>grade_list</strong></td> <td style="text-align: center;">该参数用于对连续变量做多种等级阈值的二分类检验,其中包含多个事件是否记录为发生的判断阈值,记其中一个阈值为g,则判断为事件发生的条件是要素值 &gt;= g。该参数缺省时列表中只包含一个取值为1e-30的阈值,由于气象要素精度通常比该缺省值大,因此它相当于将 &gt;0 作为事件发生的判据</td> </tr> <tr> <td style="text-align: left;"><strong>half_window_size</strong></td> <td style="text-align: center;">半边窗口的网格数据,例如当half_window_size = 1,时,计算区域中部的一个网格点的平均用到 3×3个格点,当half_window_size = 2时 用到了5×5个格点求平均, 在区域的边缘,计算平均的格点数为窗口内实际格点数,每个位置会略有差异,但会正确保持平均值的含义</td> </tr> <tr> <td style="text-align: left;"><strong>compare</strong></td> <td style="text-align: center;">比较方法,可选项包括&quot;&gt;=&quot;,&quot;&gt;&quot;,&quot;&lt;=&quot;,&quot;&lt;&quot;, 是要素原始值和阈值对比的方法,默认方法为&quot;&gt;=&quot;,即原始值大于等于阈值记为1,否则记为0,当compare为&quot;&lt;=&quot;时,原始值小于等于阈值的站点记为1, 这在基于能见度标记大雾事件、低温事件或降温事件等场景中可以用到</td> </tr> <tr> <td style="text-align: left;"><strong>masker</strong></td> <td style="text-align: center;">用于裁剪部分区域的mask数据</td> <td style="text-align: left;">numpy数据,shape和观测一致,其中只有0和1两种值,取值为1的部分是需要在计算中保留的部分</td> </tr> <tr> <td style="text-align: left;">&lt;font face=&quot;黑体&quot; color=blue size=5&gt;return&lt;/font&gt;</td> <td style="text-align: center;">返回3维数组,shape = (窗口种类数,等级数,3),其中result[...,0],result[...,1],result[...,2]分别是概率观测值的平方的均值,窗口概率预报值的平方的均值,窗口概率误差(预报-观测)的平方的均值</td> </tr> </tbody> </table> <p>参数示例</p> <pre><code class="language-python">ob_xy = grd_ob1.values.squeeze() fo_xy = grd_fo1.values.squeeze() fss_middle = mem.fbs_pobfo(ob_xy,fo_xy,grade_list=grade_list,half_window_size_list=half_window_size_list) print(fss_middle)</code></pre> <pre><code>[[[0.01026934 0.00910105 0.00344057] [0.00686192 0.00180271 0.00339929]] [[0.0064876 0.00541789 0.00129035] [0.00412846 0.00095178 0.00157669]] [[0.00455968 0.00382494 0.00072619] [0.00280346 0.00061937 0.00098296]] [[0.00345356 0.00297099 0.00047066] [0.00206549 0.00044611 0.00070138]] [[0.002774 0.00246103 0.00032273] [0.00161626 0.0003422 0.00053346]]]</code></pre> <h1>FSS</h1> <p>&lt;font face=&quot;黑体&quot; color=Blue size=3&gt;<strong>fss(grd_ob,grd_fo,grade_list=[1e-30],half_window_size_list=[1],compare = &quot;&gt;=&quot;, masker=None)</strong>&lt;/font&gt;<br /> 输入grid_data数据集,计算fss评分以及相关中间结果,计算步骤包括:</p> <ol> <li>调用fbs_pobfo函数计算中间结果</li> <li>计算fss, fss = 1 - 窗口概率误差(预报-观测)的平方的均值/(窗口概率观测值的平方的均值+窗口概率预报值的平方的均值)</li> <li>为了便于后续批量分析将结果重新组合成stadata形式</li> </ol> <table> <thead> <tr> <th style="text-align: left;">参数</th> <th style="text-align: center;">说明</th> <th style="text-align: left;">备注</th> </tr> </thead> <tbody> <tr> <td style="text-align: left;"><strong>&lt;font face=&quot;黑体&quot; color=Blue size=5&gt;grd_ob&lt;/font&gt;</strong></td> <td style="text-align: center;">一个观测平面场数据</td> <td style="text-align: left;">grid_data格式的网格数据,时效维度取值必须是0</td> </tr> <tr> <td style="text-align: left;"><strong>&lt;font face=&quot;黑体&quot; color=Blue size=5&gt;grd_fo&lt;/font&gt;</strong></td> <td style="text-align: center;">一个预报平面场数据</td> <td style="text-align: left;">ngrid_data格式的网格数据</td> </tr> <tr> <td style="text-align: left;"><strong>grade_list</strong></td> <td style="text-align: center;">该参数用于对连续变量做多种等级阈值的二分类检验,其中包含多个事件是否记录为发生的判断阈值,记其中一个阈值为g,则判断为事件发生的条件是要素值 &gt;= g。该参数缺省时列表中只包含一个取值为1e-30的阈值,由于气象要素精度通常比该缺省值大,因此它相当于将 &gt;0 作为事件发生的判据</td> </tr> <tr> <td style="text-align: left;"><strong>half_window_size</strong></td> <td style="text-align: center;">半边窗口的网格数据,例如当half_window_size = 1,时,计算区域中部的一个网格点的平均用到 3×3个格点,当half_window_size = 2时 用到了5×5个格点求平均, 在区域的边缘,计算平均的格点数为窗口内实际格点数,每个位置会略有差异,但会正确保持平均值的含义</td> </tr> <tr> <td style="text-align: left;"><strong>compare</strong></td> <td style="text-align: center;">比较方法,可选项包括&quot;&gt;=&quot;,&quot;&gt;&quot;,&quot;&lt;=&quot;,&quot;&lt;&quot;, 是要素原始值和阈值对比的方法,默认方法为&quot;&gt;=&quot;,即原始值大于等于阈值记为1,否则记为0,当compare为&quot;&lt;=&quot;时,原始值小于等于阈值的站点记为1, 这在基于能见度标记大雾事件、低温事件或降温事件等场景中可以用到</td> </tr> <tr> <td style="text-align: left;"><strong>masker</strong></td> <td style="text-align: center;">用于裁剪部分区域的mask数据</td> <td style="text-align: left;">numpy数据,shape和观测一致,其中只有0和1两种值,取值为1的部分是需要在计算中保留的部分</td> </tr> <tr> <td style="text-align: left;">&lt;font face=&quot;黑体&quot; color=blue size=5&gt;return&lt;/font&gt;</td> <td style="text-align: center;">返回3维数组,shape = (窗口种类数,等级数,3),其中result[...,0],result[...,1],result[...,2]分别是概率观测值的平方的均值,窗口概率预报值的平方的均值,窗口概率误差(预报-观测)的平方的均值</td> </tr> </tbody> </table> <p>参数示例</p> <pre><code class="language-python">fss_one = mem.fss(grd_ob1,grd_fo1,grade_list=grade_list,half_window_size_list = half_window_size_list) print(fss_one)</code></pre> <pre><code> level time dtime id lon lat fname \ 0 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF 1 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF 2 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF 3 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF 4 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF 5 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF 6 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF 7 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF 8 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF 9 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF half_window_size grade pob pfo fbs fss 0 5 5 0.010269 0.009101 0.003441 0.822380 1 5 10 0.006862 0.001803 0.003399 0.607682 2 10 5 0.006488 0.005418 0.001290 0.891618 3 10 10 0.004128 0.000952 0.001577 0.689642 4 15 5 0.004560 0.003825 0.000726 0.913391 5 15 10 0.002803 0.000619 0.000983 0.712824 6 20 5 0.003454 0.002971 0.000471 0.926740 7 20 10 0.002065 0.000446 0.000701 0.720742 8 25 5 0.002774 0.002461 0.000323 0.938351 9 25 10 0.001616 0.000342 0.000533 0.727612 </code></pre> <p>在上述结果中,fname,half_window_size,grade,pob,pfo,fbs,fss 分布记录了预报名称,采用的半窗口宽度,阈值,窗口概率观测值的平方的均值,窗口概率预报值的平方的均值,窗口概率误差(预报-观测)的平方的均值,以及对一个的fss取值。 其中每一个fss值对应一个时刻、一个时效、一种参数下评分。 如果要进行批量数据的统计分析,则可以采用如下方式,循环的调用fss函数,生成一组结果,由于fss结果是以sta_data形式存储,因此批量结果可以拼接再一起,进一步可以基于拼接的结果开展分析。</p> <pre><code class="language-python">#以下是批量读取观测和预报场数据并完成目标识别匹配和结果输出 dir_ob = r'H:\test_data\input\mem\mode\ob\rain03\YYMMDDHH.000.nc' dir_fo1 = r'H:\test_data\input\mem\mode\ec\rain03\YYMMDDHH.TTT.nc' dir_fo2 = r'H:\test_data\input\mem\mode\ncep\rain03\YYMMDDHH.TTT.nc' save_dir = r&amp;quot;H:\test_data\output\method\space\mode&amp;quot; time_s = datetime.datetime(2020,7,1,8,0) time_e = datetime.datetime(2020,7,10,8,0) grid0 = meb.grid([100,125,0.05],[20,35,0.05]) time_fo = time_s fss_list = [] while time_fo &amp;lt; time_e: print(time_fo) for dh in range(3,73,3): time_ob = time_fo + datetime.timedelta(hours = dh) path_ob = meb.get_path(dir_ob,time_ob) grd_ob1 = meb.read_griddata_from_nc(path_ob,time = time_ob,dtime = 0,level =0,data_name = &amp;quot;ob&amp;quot;,grid = grid0) path_fo = meb.get_path(dir_fo1,time_fo,dh) grd_fo1 = meb.read_griddata_from_nc(path_fo,time = time_fo,dtime = dh,level =0,data_name = &amp;quot;ECMWF&amp;quot;,grid = grid0) if grd_ob1 is not None and grd_fo1 is not None: fss_one = mem.fss(grd_ob1,grd_fo1,grade_list=grade_list,half_window_size_list = half_window_size_list) fss_list.append(fss_one) path_fo = meb.get_path(dir_fo2,time_fo,dh) grd_fo2 = meb.read_griddata_from_nc(path_fo,time = time_fo,dtime = dh,level =0,data_name = &amp;quot;NCEP&amp;quot;,grid = grid0) if grd_ob1 is not None and grd_fo2 is not None: fss_one = mem.fss(grd_ob1,grd_fo2,grade_list=grade_list,half_window_size_list = half_window_size_list) fss_list.append(fss_one) time_fo = time_fo + datetime.timedelta(hours = 12) fss_all = meb.concat(fss_list) print(fss_all)</code></pre> <pre><code>2020-07-01 08:00:00 2020-07-01 20:00:00 2020-07-02 08:00:00 2020-07-02 20:00:00 2020-07-03 08:00:00 2020-07-03 20:00:00 2020-07-04 08:00:00 2020-07-04 20:00:00 2020-07-05 08:00:00 2020-07-05 20:00:00 2020-07-06 08:00:00 2020-07-06 20:00:00 2020-07-07 08:00:00 2020-07-07 20:00:00 2020-07-08 08:00:00 2020-07-08 20:00:00 2020-07-09 08:00:00 2020-07-09 20:00:00 level time dtime id lon lat fname \ 0 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF 1 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF 2 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF 3 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF 4 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF .. ... ... ... ... ... ... ... 43 NaN 2020-07-09 20:00:00 72 999999 NaN NaN NCEP 44 NaN 2020-07-09 20:00:00 72 999999 NaN NaN NCEP 45 NaN 2020-07-09 20:00:00 72 999999 NaN NaN NCEP 46 NaN 2020-07-09 20:00:00 72 999999 NaN NaN NCEP 47 NaN 2020-07-09 20:00:00 72 999999 NaN NaN NCEP half_window_size grade pob pfo fbs fss 0 5 0.1 0.194534 6.274214e-01 0.360689 0.561182 1 5 5.0 0.032807 3.808229e-02 0.037445 0.471784 2 5 10.0 0.013510 9.198992e-03 0.016541 0.271608 3 5 15.0 0.006045 2.196897e-03 0.007541 0.085087 4 5 20.0 0.002734 8.780202e-04 0.003548 0.017810 .. ... ... ... ... ... ... 43 40 5.0 0.006086 1.416543e-02 0.013688 0.324108 44 40 10.0 0.000715 5.947981e-04 0.001184 0.096224 45 40 15.0 0.000132 1.763409e-05 0.000147 0.016037 46 40 20.0 0.000014 4.863769e-08 0.000014 0.004465 47 40 25.0 0.000002 0.000000e+00 0.000002 0.000000 [41472 rows x 13 columns]</code></pre> <p>上述结果中包含了1种模式、60种起报时间、24种时效、2种等级、5种窗口的检验fss结果,以及相应的中间结果。</p> <pre><code class="language-python"># 从总结果中挑选某一种参数的结果进行绘图分析。 fss_one_para = meb.sele_by_para(fss_all,grade =[5], half_window_size =[10],fname = [&amp;quot;ECMWF&amp;quot;]) # 选取一种参数的结果 fss_part = meb.sele_by_para(fss_one_para,member = [&amp;quot;fss&amp;quot;]) # 选取fss评分 print(fss_part)</code></pre> <pre><code> level time dtime id lon lat fss 7 NaN 2020-07-01 08:00:00 3 999999 NaN NaN 0.567072 7 NaN 2020-07-01 08:00:00 6 999999 NaN NaN 0.518486 7 NaN 2020-07-01 08:00:00 9 999999 NaN NaN 0.530505 7 NaN 2020-07-01 08:00:00 12 999999 NaN NaN 0.312065 7 NaN 2020-07-01 08:00:00 15 999999 NaN NaN 0.318812 .. ... ... ... ... ... ... ... 7 NaN 2020-07-09 20:00:00 60 999999 NaN NaN 0.672741 7 NaN 2020-07-09 20:00:00 63 999999 NaN NaN 0.578186 7 NaN 2020-07-09 20:00:00 66 999999 NaN NaN 0.360016 7 NaN 2020-07-09 20:00:00 69 999999 NaN NaN 0.214642 7 NaN 2020-07-09 20:00:00 72 999999 NaN NaN 0.188977 [432 rows x 7 columns]</code></pre> <pre><code class="language-python">meb.tool.plot_tools.mesh_time_dtime(fss_part,annot = -1,title = [&amp;quot;FSS评分随时间和时效变化&amp;quot;],cmap = &amp;quot;ts&amp;quot;) # 将所选结果绘制成图形</code></pre> <p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=abe33d2f3a5faad9977d390f78a6be68&amp;amp;file=file.png" alt="" /></p> <p>若需计算某种时效下,所有时间数据的总体的fss,不能直接将fss取平均,需要先对 pob、 pfo、 fbs列先求和,再利用fss = 1 - fbs/(pob+pfo)来计算</p> <pre><code class="language-python">fss_one_para = meb.sele_by_para(fss_all,grade =[5], half_window_size =[10],dtime = 3,fname = [&amp;quot;ECMWF&amp;quot;]) # 选取一种参数的结果 fss_03 = 1 - fss_one_para[&amp;quot;fbs&amp;quot;].sum()/(fss_one_para[&amp;quot;pob&amp;quot;].sum()+fss_one_para[&amp;quot;pfo&amp;quot;].sum()) print(&amp;quot;设置半窗口为10,阈值为5mm,则时效为3的所有起报时间的预报的总体fss为:&amp;quot;) print(fss_03)</code></pre> <pre><code>设置半窗口为10,阈值为5mm,则时效为3的所有起报时间的预报的总体fss为: 0.713404057345036</code></pre> <h1>绘制fss评分随阈值和窗口尺度的变化图</h1> <pre><code class="language-python">result = mps.score_df(fss_all,mem.fss,g = [&amp;quot;fname&amp;quot;,&amp;quot;half_window_size&amp;quot;,&amp;quot;grade&amp;quot;],plot = &amp;quot;mesh&amp;quot;,ncol = 2)</code></pre> <p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=fa2822ad215afb9b37a64497e61c625b&amp;amp;file=file.png" alt="" /></p>

页面列表

ITEM_HTML