{"id":331341,"date":"2022-03-31T21:00:18","date_gmt":"2022-03-31T21:00:18","guid":{"rendered":"http:\/\/savepearlharbor.com\/?p=331341"},"modified":"-0001-11-30T00:00:00","modified_gmt":"-0001-11-29T21:00:00","slug":"","status":"publish","type":"post","link":"https:\/\/savepearlharbor.com\/?p=331341","title":{"rendered":"<span>How to Validate Data in a Pandas DataFrame with Pandera<\/span>"},"content":{"rendered":"<div><\/div>\n<div id=\"post-content-body\">\n<div>\n<div class=\"article-formatted-body article-formatted-body_version-2\">\n<div xmlns=\"http:\/\/www.w3.org\/1999\/xhtml\">\n<h2>Make sure your data matches your expectations<\/h2>\n<figure class=\"full-width\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/791\/5c3\/43c\/7915c343c04bc5fbb90301891750593f.png\" width=\"803\" height=\"438\" data-src=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/791\/5c3\/43c\/7915c343c04bc5fbb90301891750593f.png\"\/><figcaption><\/figcaption><\/figure>\n<p>In data science it is important to test not only your functions but also your data, to make sure it behaves the way you expect. We are sharing this piece on <a href=\"https:\/\/pandera.readthedocs.io\/en\/stable\/\">Pandera<\/a>, a simple library for validating Pandas DataFrames, ahead of the launch of our flagship <a href=\"https:\/\/skillfactory.ru\/data-scientist-pro?utm_source=habr&amp;utm_medium=habr&amp;utm_campaign=article&amp;utm_content=data-science_dspr_310322&amp;utm_term=lead\">Data Science course<\/a>.<\/p>\n<hr\/>\n<p>To install Pandera, run this in a terminal:<\/p>\n<pre><code class=\"bash\">pip install pandera<\/code><\/pre>\n<h2>Introduction<\/h2>\n<p>Let's start with a simple dataset to see how Pandera works:<\/p>\n<pre><code class=\"python\">import pandas as pd\n\nfruits = pd.DataFrame(\n    {\n        \"name\": [\"apple\", \"banana\", \"apple\", \"orange\"],\n        \"store\": [\"Aldi\", \"Walmart\", \"Walmart\", \"Aldi\"],\n        \"price\": [2, 1, 3, 4],\n    }\n)\n\nfruits<\/code><\/pre>\n<figure class=\"\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/332\/09b\/cc6\/33209bcc6c928fe36711b2d6be53e124.png\" 
width=\"453\" height=\"334\" data-src=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/332\/09b\/cc6\/33209bcc6c928fe36711b2d6be53e124.png\"\/><figcaption><\/figcaption><\/figure>\n<p>Imagine your manager tells you that the dataset may contain only certain fruits and stores, and that every price must be less than 4:<\/p>\n<pre><code class=\"python\">available_fruits = [\"apple\", \"banana\", \"orange\"]\nnearby_stores = [\"Aldi\", \"Walmart\"]<\/code><\/pre>\n<p>Checking data by hand can take a long time, especially when there is a lot of it. Is there a way to automate the check? 
Yes, and this is where Pandera comes in:<\/p>\n<ul>\n<li>\n<p>define tests for the whole dataset with DataFrameSchema;<\/p>\n<\/li>\n<li>\n<p>define tests for each column with Column;<\/p>\n<\/li>\n<li>\n<p>define the kind of test with Check.<\/p>\n<\/li>\n<\/ul>\n<pre><code class=\"python\">import pandera as pa\nfrom pandera import Column, Check\n\nschema = pa.DataFrameSchema(\n    {\n        \"name\": Column(str, Check.isin(available_fruits)),\n        \"store\": Column(str, Check.isin(nearby_stores)),\n        \"price\": Column(int, Check.less_than(4)),\n    }\n)\nschema.validate(fruits)<\/code><\/pre>\n<pre><code class=\"bash\">SchemaError: &lt;Schema Column(name=price, type=DataType(int64))> failed element-wise validator 0:\n&lt;Check less_than: less_than(4)>\nfailure cases:\n   index  failure_case\n0      3             4<\/code><\/pre>\n<p>Here is what this code does:<\/p>\n<ul>\n<li>\n<p><code>\"name\": Column(str, Check.isin(available_fruits))<\/code> checks that the name column has the string type and that all of its values are in the given list;<\/p>\n<\/li>\n<li>\n<p><code>\"price\": Column(int, Check.less_than(4))<\/code> checks that all values in the price column have the int type and are less than 4;<\/p>\n<\/li>\n<li>\n<p>not every value in the price column is less than 4 (the value at index 3 equals 4), so validation fails.<\/p>\n<\/li>\n<\/ul>\n<p>Other built-in Check methods are listed <a href=\"https:\/\/pandera.readthedocs.io\/en\/stable\/reference\/generated\/pandera.checks.Check.html#pandera-checks-check\">here<\/a>.<\/p>\n<h3>Custom checks<\/h3>\n<p>Checks can also be written as lambda expressions. 
In the code below, Check(lambda price: sum(price) &lt; 20) checks that the prices sum to less than 20.<\/p>\n<pre><code class=\"python\">schema = pa.DataFrameSchema(\n    {\n        \"name\": Column(str, Check.isin(available_fruits)),\n        \"store\": Column(str, Check.isin(nearby_stores)),\n        \"price\": Column(\n            int, [Check.less_than(5), Check(lambda price: sum(price) &lt; 20)]\n        ),\n    }\n)\nschema.validate(fruits)<\/code><\/pre>\n<h2>SchemaModel<\/h2>\n<p>When the tests get complex, classes keep the code cleaner than dictionaries. Fortunately, Pandera also lets you define the same tests in a dataclass-like style.<\/p>\n<pre><code class=\"python\">from pandera.typing import Series\n\n\nclass Schema(pa.SchemaModel):\n    name: Series[str] = pa.Field(isin=available_fruits)\n    store: Series[str] = pa.Field(isin=nearby_stores)\n    # lt=5 mirrors Check.less_than(5) from the dictionary-based schema\n    price: Series[int] = pa.Field(lt=5)\n\n    @pa.check(\"price\")\n    def price_sum_lt_20(cls, price: Series[int]) -> bool:\n        return sum(price) &lt; 20\n\n\nSchema.validate(fruits)<\/code><\/pre>\n<h2>Validation decorators<\/h2>\n<h3>Input validation<\/h3>\n<p>How do you validate the input of a function? The straightforward approach is to call schema.validate(input) right inside the function:<\/p>\n<pre><code class=\"python\">fruits = pd.DataFrame(\n    {\n        \"name\": [\"apple\", \"banana\", \"apple\", \"orange\"],\n        \"store\": [\"Aldi\", \"Walmart\", \"Walmart\", \"Aldi\"],\n        \"price\": [2, 1, 3, 4],\n    }\n)\n\nschema = pa.DataFrameSchema(\n    {\n        \"name\": Column(str, Check.isin(available_fruits)),\n        \"store\": Column(str, Check.isin(nearby_stores)),\n        \"price\": Column(int, Check.less_than(5)),\n    }\n)\n\n\ndef get_total_price(fruits: pd.DataFrame, schema: pa.DataFrameSchema):\n    validated = schema.validate(fruits)\n    return validated[\"price\"].sum()\n\n\nget_total_price(fruits, schema)<\/code><\/pre>\n<p>But this approach makes testing harder. 
The get_total_price function takes both fruits and schema as arguments, so a test for the function has to supply both:<\/p>\n<pre><code class=\"python\">def test_get_total_price():\n    fruits = pd.DataFrame({'name': ['apple', 'banana'], 'store': ['Aldi', 'Walmart'], 'price': [1, 2]})\n\n    # Need to include schema in the unit test\n    schema = pa.DataFrameSchema(\n        {\n            \"name\": Column(str, Check.isin(available_fruits)),\n            \"store\": Column(str, Check.isin(nearby_stores)),\n            \"price\": Column(int, Check.less_than(5)),\n        }\n    )\n    assert get_total_price(fruits, schema) == 3<\/code><\/pre>\n<p>The test_get_total_price function tests both the data and the function. A unit test should check only one thing, so validating data inside the function is not an ideal solution.<\/p>\n<p>Pandera solves this problem with the check_input decorator. 
The schema passed to the decorator is used to validate the function's input:<\/p>\n<pre><code class=\"python\">from pandera import check_input\n\n\n@check_input(schema)\ndef get_total_price(fruits: pd.DataFrame):\n    return fruits.price.sum()\n\n\nget_total_price(fruits)<\/code><\/pre>\n<p>If the input is invalid, Pandera raises an exception before the function processes it:<\/p>\n<pre><code class=\"python\">fruits = pd.DataFrame(\n    {\n        \"name\": [\"apple\", \"banana\", \"apple\", \"orange\"],\n        \"store\": [\"Aldi\", \"Walmart\", \"Walmart\", \"Aldi\"],\n        \"price\": [\"2\", \"1\", \"3\", \"4\"],\n    }\n)\n\n\n@check_input(schema)\ndef get_total_price(fruits: pd.DataFrame):\n    return fruits.price.sum()\n\n\nget_total_price(fruits)<\/code><\/pre>\n<pre><code class=\"bash\">SchemaError: error in check_input decorator of function 'get_total_price': expected series 'price' to have type int64, got object<\/code><\/pre>\n<p>Validating the input before the function processes it can save a lot of time.<\/p>\n<h3>Output validation<\/h3>\n<p>To validate the output of a function, use the check_output decorator:<\/p>\n<pre><code class=\"python\">from pandera import check_output\n\nfruits_nearby = pd.DataFrame(\n    {\n        \"name\": [\"apple\", \"banana\", \"apple\", \"orange\"],\n        \"store\": [\"Aldi\", \"Walmart\", \"Walmart\", \"Aldi\"],\n        \"price\": [2, 1, 3, 4],\n    }\n)\n\nfruits_faraway = pd.DataFrame(\n    {\n        \"name\": [\"apple\", \"banana\", \"apple\", \"orange\"],\n        \"store\": [\"Whole Foods\", \"Whole Foods\", \"Schnucks\", \"Schnucks\"],\n        \"price\": [3, 2, 4, 5],\n    }\n)\n\nout_schema = pa.DataFrameSchema(\n    {\"store\": Column(str, Check.isin([\"Aldi\", \"Walmart\", \"Whole Foods\", \"Schnucks\"]))}\n)\n\n\n@check_output(out_schema)\ndef combine_fruits(fruits_nearby: pd.DataFrame, fruits_faraway: pd.DataFrame):\n    fruits = pd.concat([fruits_nearby, fruits_faraway])\n    return fruits\n\n\ncombine_fruits(fruits_nearby, fruits_faraway)<\/code><\/pre>\n<h3>Input and output validation<\/h3>\n<p>To validate both the inputs and the output, use the check_io decorator:<\/p>\n<pre><code class=\"python\">from pandera import check_io\n\nin_schema = pa.DataFrameSchema({\"store\": Column(str)})\n\nout_schema = pa.DataFrameSchema(\n    {\"store\": Column(str, Check.isin([\"Aldi\", \"Walmart\", \"Whole Foods\", \"Schnucks\"]))}\n)\n\n\n@check_io(fruits_nearby=in_schema, 
fruits_faraway=in_schema, out=out_schema)\ndef combine_fruits(fruits_nearby: pd.DataFrame, fruits_faraway: pd.DataFrame):\n    fruits = pd.concat([fruits_nearby, fruits_faraway])\n    return fruits\n\n\ncombine_fruits(fruits_nearby, fruits_faraway)<\/code><\/pre>\n<h2>Other column validation arguments<\/h2>\n<h3>Null<\/h3>\n<p>By default, Pandera raises an error if a validated column contains null values. If nulls are acceptable, pass nullable=True to Column:<\/p>\n<pre><code class=\"python\">import numpy as np\n\nfruits = pd.DataFrame(\n    {\n        \"name\": [\"apple\", \"banana\", \"apple\", \"orange\"],\n        \"store\": [\"Aldi\", \"Walmart\", \"Walmart\", np.nan],\n        \"price\": [2, 1, 3, 4],\n    }\n)\n\nschema = pa.DataFrameSchema(\n    {\n        \"name\": Column(str, Check.isin(available_fruits)),\n        \"store\": Column(str, Check.isin(nearby_stores), nullable=True),\n        \"price\": Column(int, Check.less_than(5)),\n    }\n)\nschema.validate(fruits)<\/code><\/pre>\n<h3>Duplicates<\/h3>\n<p>By default, duplicates are allowed. 
To raise an exception when a column contains duplicates, pass allow_duplicates=False:<\/p>\n<pre><code class=\"python\">schema = pa.DataFrameSchema(\n    {\n        \"name\": Column(str, Check.isin(available_fruits)),\n        \"store\": Column(\n            str, Check.isin(nearby_stores), nullable=True, allow_duplicates=False\n        ),\n        \"price\": Column(int, Check.less_than(5)),\n    }\n)\nschema.validate(fruits)<\/code><\/pre>\n<pre><code class=\"bash\">SchemaError: series 'store' contains duplicate values: {2: 'Walmart'}<\/code><\/pre>\n<h3>Data type coercion<\/h3>\n<p>The coerce=True argument converts a column to the data type declared in the schema when its current type does not match.<\/p>\n<p>In the code below, the data type of price is converted from integer to string:<\/p>\n<pre><code class=\"python\">fruits = pd.DataFrame(\n    {\n        \"name\": [\"apple\", \"banana\", \"apple\", \"orange\"],\n        \"store\": [\"Aldi\", \"Walmart\", \"Walmart\", \"Aldi\"],\n        \"price\": [2, 1, 3, 4],\n    }\n)\n\nschema = pa.DataFrameSchema({\"price\": Column(str, coerce=True)})\nvalidated = schema.validate(fruits)\n
validated.dtypes<\/code><\/pre>\n<pre><code class=\"bash\">name     object\nstore    object\nprice    object\ndtype: object<\/code><\/pre>\n<h2>Pattern matching<\/h2>\n<p>What if we want to apply the same checks to every column whose name starts with store?<\/p>\n<pre><code class=\"python\">favorite_stores = [\"Aldi\", \"Walmart\", \"Whole Foods\", \"Schnucks\"]\n\nfruits = pd.DataFrame(\n    {\n        \"name\": [\"apple\", \"banana\", \"apple\", \"orange\"],\n        \"store_nearby\": [\"Aldi\", \"Walmart\", \"Walmart\", \"Aldi\"],\n        \"store_far\": [\"Whole Foods\", \"Schnucks\", \"Whole Foods\", \"Schnucks\"],\n    }\n)<\/code><\/pre>\n<p>Pandera lets us apply the same checks to several columns whose names match a pattern by passing regex=True:<\/p>\n<pre><code class=\"python\">schema = pa.DataFrameSchema(\n    {\n        \"name\": Column(str, Check.isin(available_fruits)),\n        # \"store_.+\" matches store_nearby and store_far\n        \"store_.+\": Column(str, Check.isin(favorite_stores), regex=True),\n    }\n)\nschema.validate(fruits)<\/code><\/pre>\n<h2>Exporting to and loading from a YAML file<\/h2>\n<h3>Exporting to YAML<\/h3>\n<p>YAML is a great way to show your tests to colleagues who do not know Python. You can save all the checks to a YAML file with the schema.to_yaml() method:<\/p>\n<pre><code class=\"python\">from pathlib import Path\n\n# Serialize the schema to a YAML string\nyaml_schema = schema.to_yaml()\n\n# Save to a file\nf = Path(\"schema.yml\")\nf.touch()\nf.write_text(yaml_schema)<\/code><\/pre>\n<p>For the schema from the Duplicates section, the schema.yml file should look roughly like this:<\/p>\n<pre><code class=\"yaml\">schema_type: dataframe\nversion: 0.7.0\ncolumns:\n  name:\n    dtype: str\n    nullable: false\n    checks:\n      isin:\n      - apple\n      - banana\n      - orange\n    allow_duplicates: true\n    coerce: false\n    required: true\n    regex: false\n  store:\n    dtype: str\n    nullable: true\n    checks:\n      isin:\n      - Aldi\n      - Walmart\n    allow_duplicates: false\n    coerce: false\n    required: true\n    regex: false\n  price:\n    dtype: int64\n    nullable: false\n    checks:\n      less_than: 5\n    allow_duplicates: true\n    coerce: false\n    required: true\n    regex: false\nchecks: null\nindex: null\ncoerce: false\nstrict: false<\/code><\/pre>\n<h3>Loading from YAML<\/h3>\n<p>To load the schema back from the file, use pa.io.from_yaml(yaml_schema):<\/p>\n<pre><code class=\"python\">with f.open() as file:\n    yaml_schema = file.read()\n\nschema = pa.io.from_yaml(yaml_schema)<\/code><\/pre>\n<h2>Conclusion<\/h2>\n<p>Congratulations! You have just learned how to use Pandera to validate your dataset. Because data is a central part of any data science project, validating the inputs and outputs of your functions helps reduce errors at every stage of the work. 
Feel free to <a href=\"https:\/\/github.com\/khuyentran1401\/Data-science\/blob\/master\/data_science_tools\/pandera_example\/pandera.ipynb\">fork<\/a> the source code for this article. <\/p>\n<p>And we will help you level up your skills or learn, from scratch, a profession that is always in demand:<\/p>\n<ul>\n<li>\n<p><a href=\"https:\/\/skillfactory.ru\/data-scientist-pro?utm_source=habr&amp;utm_medium=habr&amp;utm_campaign=article&amp;utm_content=data-science_dspr_310322&amp;utm_term=conc\">Data Scientist career track<\/a><\/p>\n<\/li>\n<li>\n<p><a href=\"https:\/\/skillfactory.ru\/data-analyst-pro?utm_source=habr&amp;utm_medium=habr&amp;utm_campaign=article&amp;utm_content=analytics_dapr_310322&amp;utm_term=conc\">Data Analyst career track<\/a><\/p>\n<\/li>\n<\/ul>\n<p>Choose another <a href=\"https:\/\/skillfactory.ru\/catalogue?utm_source=habr&amp;utm_medium=habr&amp;utm_campaign=article&amp;utm_content=sf_allcourses_310322&amp;utm_term=conc\">in-demand profession<\/a>.<\/p>\n<figure class=\"full-width\"><img loading=\"lazy\" decoding=\"async\" 
src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/027\/8b7\/117\/0278b711791c2128ff0dcc7c7fa65d6b.png\" width=\"1000\" height=\"200\" data-src=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/027\/8b7\/117\/0278b711791c2128ff0dcc7c7fa65d6b.png\"\/><figcaption><\/figcaption><\/figure>\n<details class=\"spoiler\">\n<summary>Short catalog of courses and professions<\/summary>\n<div class=\"spoiler__content\">\n<p><strong>Also<\/strong><\/p>\n<ul>\n<li>\n<p><a 
href=\"https:\/\/skillfactory.ru\/devops-ingineer?utm_source=habr&amp;utm_medium=habr&amp;utm_campaign=article&amp;utm_content=coding_devops_310322&amp;utm_term=cat\">\u041a\u0443\u0440\u0441 \u043f\u043e DevOps<\/a><\/p>\n<\/li>\n<li>\n<p><a href=\"https:\/\/skillfactory.ru\/catalogue?utm_source=habr&amp;utm_medium=habr&amp;utm_campaign=article&amp;utm_content=sf_allcourses_310322&amp;utm_term=cat\">\u0412\u0441\u0435 \u043a\u0443\u0440\u0441\u044b<\/a><\/p>\n<\/li>\n<\/ul>\n<\/div>\n<\/details>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"v-portal\" style=\"display:none;\"><\/div>\n<\/div>\n<p> <!----> <!----><br \/> \u0441\u0441\u044b\u043b\u043a\u0430 \u043d\u0430 \u043e\u0440\u0438\u0433\u0438\u043d\u0430\u043b \u0441\u0442\u0430\u0442\u044c\u0438 <a href=\"https:\/\/habr.com\/ru\/company\/skillfactory\/blog\/658473\/\"> https:\/\/habr.com\/ru\/company\/skillfactory\/blog\/658473\/<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<div><\/div>\n<div id=\"post-content-body\">\n<div>\n<div class=\"article-formatted-body article-formatted-body_version-2\">\n<div xmlns=\"http:\/\/www.w3.org\/1999\/xhtml\">\n<h2>\u0423\u0431\u0435\u0434\u0438\u0442\u0435\u0441\u044c, \u0447\u0442\u043e \u0434\u0430\u043d\u043d\u044b\u0435 \u0441\u043e\u043e\u0442\u0432\u0435\u0442\u0441\u0442\u0432\u0443\u044e\u0442 \u043e\u0436\u0438\u0434\u0430\u043d\u0438\u044f\u043c<\/h2>\n<figure class=\"full-width\"><figcaption><\/figcaption><\/figure>\n<p>\u0412 \u043d\u0430\u0443\u043a\u0435 \u043e \u0434\u0430\u043d\u043d\u044b\u0445 \u0432\u0430\u0436\u043d\u043e \u0442\u0435\u0441\u0442\u0438\u0440\u043e\u0432\u0430\u0442\u044c \u043d\u0435 \u0442\u043e\u043b\u044c\u043a\u043e \u0444\u0443\u043d\u043a\u0446\u0438\u0438, \u043d\u043e \u0438 \u0434\u0430\u043d\u043d\u044b\u0435, \u0447\u0442\u043e\u0431\u044b \u0443\u0431\u0435\u0434\u0438\u0442\u044c\u0441\u044f, \u0447\u0442\u043e \u043e\u043d\u0438 \u0440\u0430\u0431\u043e\u0442\u0430\u044e\u0442 \u0442\u0430\u043a, \u043a\u0430\u043a 
\u0432\u044b \u043e\u0436\u0438\u0434\u0430\u043b\u0438. \u041c\u0430\u0442\u0435\u0440\u0438\u0430\u043b\u043e\u043c \u043e \u043f\u0440\u043e\u0441\u0442\u043e\u0439 \u0431\u0438\u0431\u043b\u0438\u043e\u0442\u0435\u043a\u0435 <a href=\"https:\/\/pandera.readthedocs.io\/en\/stable\/\">Pandera<\/a> \u0434\u043b\u044f \u0432\u0430\u043b\u0438\u0434\u0430\u0446\u0438\u0438 \u0444\u0440\u0435\u0439\u043c\u043e\u0432 \u0434\u0430\u043d\u043d\u044b\u0445 Pandas \u0434\u0435\u043b\u0438\u043c\u0441\u044f \u043a \u0441\u0442\u0430\u0440\u0442\u0443 \u0444\u043b\u0430\u0433\u043c\u0430\u043d\u0441\u043a\u043e\u0433\u043e <a href=\"https:\/\/skillfactory.ru\/data-scientist-pro?utm_source=habr&amp;utm_medium=habr&amp;utm_campaign=article&amp;utm_content=data-science_dspr_310322&amp;utm_term=lead\">\u043a\u0443\u0440\u0441\u0430 \u043f\u043e Data Science<\/a>.<\/p>\n<hr\/>\n<p>\u0427\u0442\u043e\u0431\u044b \u0443\u0441\u0442\u0430\u043d\u043e\u0432\u0438\u0442\u044c Pandera, \u0432 \u0442\u0435\u0440\u043c\u0438\u043d\u0430\u043b\u0435 \u043d\u0430\u0431\u0435\u0440\u0438\u0442\u0435:<\/p>\n<pre><code class=\"bash\">pip install pandera<\/code><\/pre>\n<h2>\u0412\u0432\u0435\u0434\u0435\u043d\u0438\u0435<\/h2>\n<p>\u041d\u0430\u0447\u043d\u0451\u043c \u0441 \u043f\u0440\u043e\u0441\u0442\u043e\u0433\u043e \u043d\u0430\u0431\u043e\u0440\u0430 \u0434\u0430\u043d\u043d\u044b\u0445, \u0447\u0442\u043e\u0431\u044b \u043f\u043e\u043d\u044f\u0442\u044c, \u043a\u0430\u043a \u0440\u0430\u0431\u043e\u0442\u0430\u0435\u0442 Pandera:<\/p>\n<pre><code class=\"python\">import pandas as pd  fruits = pd.DataFrame(     {         \"name\": [\"apple\", \"banana\", \"apple\", \"orange\"],         \"store\": [\"Aldi\", \"Walmart\", \"Walmart\", \"Aldi\"],         \"price\": [2, 1, 3, 4],     } )  fruits<\/code><\/pre>\n<figure class=\"\"><figcaption><\/figcaption><\/figure>\n<p>\u041f\u0440\u0435\u0434\u0441\u0442\u0430\u0432\u044c\u0442\u0435: \u0432\u0430\u0448 
\u043c\u0435\u043d\u0435\u0434\u0436\u0435\u0440 \u0441\u043a\u0430\u0437\u0430\u043b \u0432\u0430\u043c, \u0447\u0442\u043e \u0432 \u043d\u0430\u0431\u043e\u0440\u0435 \u0434\u0430\u043d\u043d\u044b\u0445 \u043c\u043e\u0433\u0443\u0442 \u0445\u0440\u0430\u043d\u0438\u0442\u044c\u0441\u044f \u0442\u043e\u043b\u044c\u043a\u043e \u043e\u043f\u0440\u0435\u0434\u0435\u043b\u0451\u043d\u043d\u044b\u0435 \u0444\u0440\u0443\u043a\u0442\u044b, \u0430 \u0437\u043d\u0430\u0447\u0435\u043d\u0438\u0435 \u0438\u0445 \u0446\u0435\u043d\u044b \u0434\u043e\u043b\u0436\u043d\u043e \u0431\u044b\u0442\u044c \u043c\u0435\u043d\u044c\u0448\u0435 4:<\/p>\n<pre><code class=\"python\">available_fruits = [\"apple\", \"banana\", \"orange\"] nearby_stores = [\"Aldi\", \"Walmart\"]<\/code><\/pre>\n<p>\u041f\u0440\u043e\u0432\u0435\u0440\u043a\u0430 \u0434\u0430\u043d\u043d\u044b\u0445 \u0432\u0440\u0443\u0447\u043d\u0443\u044e \u043c\u043e\u0436\u0435\u0442 \u0437\u0430\u043d\u044f\u0442\u044c \u043c\u043d\u043e\u0433\u043e \u0432\u0440\u0435\u043c\u0435\u043d\u0438, \u043e\u0441\u043e\u0431\u0435\u043d\u043d\u043e \u043a\u043e\u0433\u0434\u0430 \u0438\u0445 \u043c\u043d\u043e\u0433\u043e. \u0415\u0441\u0442\u044c \u043b\u0438 \u0441\u043f\u043e\u0441\u043e\u0431 \u0430\u0432\u0442\u043e\u043c\u0430\u0442\u0438\u0437\u0438\u0440\u043e\u0432\u0430\u0442\u044c \u043f\u0440\u043e\u0432\u0435\u0440\u043a\u0443? 
Yes, and this is where Pandera comes in:<\/p>\n<ul>\n<li>\n<p>we define tests for the whole dataset with DataFrameSchema;<\/p>\n<\/li>\n<li>\n<p>tests for each column with Column;<\/p>\n<\/li>\n<li>\n<p>and the kind of test with Check.<\/p>\n<\/li>\n<\/ul>\n<pre><code class=\"python\">import pandera as pa\nfrom pandera import Column, Check\n\nschema = pa.DataFrameSchema(\n    {\n        \"name\": Column(str, Check.isin(available_fruits)),\n        \"store\": Column(str, Check.isin(nearby_stores)),\n        \"price\": Column(int, Check.less_than(4)),\n    }\n)\nschema.validate(fruits)<\/code><\/pre>\n<pre><code class=\"bash\">SchemaError: &lt;Schema Column(name=price, type=DataType(int64))> failed element-wise validator 0:\n&lt;Check less_than: less_than(4)>\nfailure cases:\n   index  failure_case\n0      3             4<\/code><\/pre>\n<p>Here is what this code does:<\/p>\n<ul>\n<li>\n<p><code>\"name\": Column(str, Check.isin(available_fruits))<\/code> checks that the name column has the string type and that all of its values are in the given list;<\/p>\n<\/li>\n<li>\n<p><code>\"price\": Column(int, Check.less_than(4))<\/code> checks that all values in the price column have the int type and are less than 4;<\/p>\n<\/li>\n<li>\n<p>not all values in the price column are less than 4, so the test fails.<\/p>\n<\/li>\n<\/ul>\n<p>You can find the other built-in Check methods <a href=\"https:\/\/pandera.readthedocs.io\/en\/stable\/reference\/generated\/pandera.checks.Check.html#pandera-checks-check\">here<\/a>.<\/p>\n<h3>Custom checks<\/h3>\n<p>Checks can also be written as lambda expressions.
In the code below, Check(lambda price: sum(price) &lt; 20) checks that the sum of the price column is less than 20.<\/p>\n<pre><code class=\"python\">schema = pa.DataFrameSchema(\n    {\n        \"name\": Column(str, Check.isin(available_fruits)),\n        \"store\": Column(str, Check.isin(nearby_stores)),\n        \"price\": Column(\n            int, [Check.less_than(5), Check(lambda price: sum(price) &lt; 20)]\n        ),\n    }\n)\nschema.validate(fruits)<\/code><\/pre>\n<h2>SchemaModel<\/h2>\n<p>When the tests get complex, dataclass-style classes keep the code cleaner than dictionaries. Fortunately, Pandera lets you define tests with such classes as well.<\/p>\n<pre><code class=\"python\">from pandera.typing import Series\n\nclass Schema(pa.SchemaModel):\n    name: Series[str] = pa.Field(isin=available_fruits)\n    store: Series[str] = pa.Field(isin=nearby_stores)\n    price: Series[int] = pa.Field(lt=5)\n\n    @pa.check(\"price\")\n    def price_sum_lt_20(cls, price: Series[int]) -> bool:\n        return sum(price) &lt; 20\n\nSchema.validate(fruits)<\/code><\/pre>\n<h2>Validation decorators<\/h2>\n<h3>Checking the input<\/h3>\n<p>How do you validate a function's input? The straightforward approach is to call schema.validate(input) inside the function:<\/p>\n<pre><code class=\"python\">fruits = pd.DataFrame(\n    {\n        \"name\": [\"apple\", \"banana\", \"apple\", \"orange\"],\n        \"store\": [\"Aldi\", \"Walmart\", \"Walmart\", \"Aldi\"],\n        \"price\": [2, 1, 3, 4],\n    }\n)\n\nschema = pa.DataFrameSchema(\n    {\n        \"name\": Column(str, Check.isin(available_fruits)),\n        \"store\": Column(str, Check.isin(nearby_stores)),\n        \"price\": Column(int, Check.less_than(5)),\n    }\n)\n\n\ndef get_total_price(fruits: pd.DataFrame, schema: pa.DataFrameSchema):\n    validated = schema.validate(fruits)\n    return validated[\"price\"].sum()\n\n\nget_total_price(fruits, schema)<\/code><\/pre>\n<p>But this approach makes testing harder.
The get_total_price function takes both fruits and schema as arguments, so a test of the function has to provide both:<\/p>\n<pre><code class=\"python\">def test_get_total_price():\n    fruits = pd.DataFrame({'name': ['apple', 'banana'], 'store': ['Aldi', 'Walmart'], 'price': [1, 2]})\n\n    # Need to include schema in the unit test\n    schema = pa.DataFrameSchema(\n        {\n            \"name\": Column(str, Check.isin(available_fruits)),\n            \"store\": Column(str, Check.isin(nearby_stores)),\n            \"price\": Column(int, Check.less_than(5)),\n        }\n    )\n    assert get_total_price(fruits, schema) == 3<\/code><\/pre>\n<p>The test_get_total_price function tests both the data and the function. A unit test should check only one thing, so putting data validation inside the function is not ideal.<\/p>\n<p>Pandera solves this problem with the check_input decorator.
The schema passed to the decorator is used to validate the function's input:<\/p>\n<pre><code class=\"python\">from pandera import check_input\n\n@check_input(schema)\ndef get_total_price(fruits: pd.DataFrame):\n    return fruits.price.sum()\n\nget_total_price(fruits)<\/code><\/pre>\n<p>If the input is invalid, Pandera raises an exception before the function processes it:<\/p>\n<pre><code class=\"python\">fruits = pd.DataFrame(\n    {\n        \"name\": [\"apple\", \"banana\", \"apple\", \"orange\"],\n        \"store\": [\"Aldi\", \"Walmart\", \"Walmart\", \"Aldi\"],\n        \"price\": [\"2\", \"1\", \"3\", \"4\"],\n    }\n)\n\n@check_input(schema)\ndef get_total_price(fruits: pd.DataFrame):\n    return fruits.price.sum()\n\nget_total_price(fruits)<\/code><\/pre>\n<pre><code class=\"bash\">SchemaError: error in check_input decorator of function 'get_total_price': expected series 'price' to have type int64, got object<\/code><\/pre>\n<p>Validating the input before the function processes it can save a lot of time.<\/p>\n<h3>Checking the output<\/h3>\n<p>To validate a function's output, use the check_output decorator:<\/p>\n<pre><code class=\"python\">from pandera import check_output\n\nfruits_nearby = pd.DataFrame(\n    {\n        \"name\": [\"apple\", \"banana\", \"apple\", \"orange\"],\n        \"store\": [\"Aldi\", \"Walmart\", \"Walmart\", \"Aldi\"],\n        \"price\": [2, 1, 3, 4],\n    }\n)\n\nfruits_faraway = pd.DataFrame(\n    {\n        \"name\": [\"apple\", \"banana\", \"apple\", \"orange\"],\n        \"store\": [\"Whole Foods\", \"Whole Foods\", \"Schnucks\", \"Schnucks\"],\n        \"price\": [3, 2, 4, 5],\n    }\n)\n\nout_schema = pa.DataFrameSchema(\n    {\"store\": Column(str, Check.isin([\"Aldi\", \"Walmart\", \"Whole Foods\", \"Schnucks\"]))}\n)\n\n\n@check_output(out_schema)\ndef combine_fruits(fruits_nearby: pd.DataFrame, fruits_faraway: pd.DataFrame):\n    fruits = pd.concat([fruits_nearby, fruits_faraway])\n    return fruits\n\n\ncombine_fruits(fruits_nearby, fruits_faraway)<\/code><\/pre>\n<h3>Checking both input and output<\/h3>\n<p>To validate both the input and the output, use the check_io decorator:<\/p>\n<pre><code class=\"python\">from pandera import check_io\n\nin_schema = pa.DataFrameSchema({\"store\": Column(str)})\n\nout_schema = pa.DataFrameSchema(\n    {\"store\": Column(str, Check.isin([\"Aldi\", \"Walmart\", \"Whole Foods\", \"Schnucks\"]))}\n)\n\n\n@check_io(fruits_nearby=in_schema, fruits_faraway=in_schema, out=out_schema)\ndef combine_fruits(fruits_nearby: pd.DataFrame, fruits_faraway: pd.DataFrame):\n    fruits = pd.concat([fruits_nearby, fruits_faraway])\n    return fruits\n\n\ncombine_fruits(fruits_nearby, fruits_faraway)<\/code><\/pre>\n<h2>Other column validation arguments<\/h2>\n<h3>Null<\/h3>\n<p>By default, Pandera raises an error if a validated column contains null values. If nulls are acceptable, pass nullable=True to Column:<\/p>\n<pre><code class=\"python\">import numpy as np\n\nfruits = pd.DataFrame(\n    {\n        \"name\": [\"apple\", \"banana\", \"apple\", \"orange\"],\n        \"store\": [\"Aldi\", \"Walmart\", \"Walmart\", np.nan],\n        \"price\": [2, 1, 3, 4],\n    }\n)\n\nschema = pa.DataFrameSchema(\n    {\n        \"name\": Column(str, Check.isin(available_fruits)),\n        \"store\": Column(str, Check.isin(nearby_stores), nullable=True),\n        \"price\": Column(int, Check.less_than(5)),\n    }\n)\nschema.validate(fruits)<\/code><\/pre>\n<h3>Duplicates<\/h3>\n<p>By default, duplicates are allowed.
To raise an exception when a column contains duplicates, add the allow_duplicates=False argument:<\/p>\n<pre><code class=\"python\">schema = pa.DataFrameSchema(\n    {\n        \"name\": Column(str, Check.isin(available_fruits)),\n        \"store\": Column(\n            str, Check.isin(nearby_stores), nullable=True, allow_duplicates=False\n        ),\n        \"price\": Column(int, Check.less_than(5)),\n    }\n)\nschema.validate(fruits)<\/code><\/pre>\n<pre><code class=\"bash\">SchemaError: series 'store' contains duplicate values: {2: 'Walmart'}<\/code><\/pre>\n<h3>Data type conversion<\/h3>\n<p>The coerce=True argument converts a column to the data type declared in the schema when its current type does not match.<\/p>\n<p>In the code below, the price column is converted from integer to string:<\/p>\n<pre><code class=\"python\">fruits = pd.DataFrame(\n    {\n        \"name\": [\"apple\", \"banana\", \"apple\", \"orange\"],\n        \"store\": [\"Aldi\", \"Walmart\", \"Walmart\", \"Aldi\"],\n        \"price\": [2, 1, 3, 4],\n    }\n)\n\nschema = pa.DataFrameSchema({\"price\": Column(str, coerce=True)})\nvalidated = schema.validate(fruits)\nvalidated.dtypes<\/code><\/pre>\n<pre><code class=\"bash\">name     object\nstore    object\nprice    object\ndtype: object<\/code><\/pre>\n<h2>Pattern matching<\/h2>\n<p>What if we want to validate every column whose name starts with the word store?<\/p>\n<pre><code class=\"python\">favorite_stores = [\"Aldi\", \"Walmart\", \"Whole Foods\", \"Schnucks\"]\n\nfruits = pd.DataFrame(\n    {\n        \"name\": [\"apple\", \"banana\", \"apple\", \"orange\"],\n        \"store_nearby\": [\"Aldi\", \"Walmart\", \"Walmart\", \"Aldi\"],\n        \"store_far\": [\"Whole Foods\", \"Schnucks\", \"Whole Foods\", \"Schnucks\"],\n    }\n)<\/code><\/pre>\n<p>Pandera lets us apply the same checks to several columns whose names match a pattern by passing regex=True:<\/p>\n<pre><code class=\"python\">schema = pa.DataFrameSchema(\n    {\n        \"name\": Column(str, Check.isin(available_fruits)),\n        \"store_+\": Column(str, Check.isin(favorite_stores), regex=True),\n    }\n)\nschema.validate(fruits)<\/code><\/pre>\n<h2>Exporting to and loading from a YAML file<\/h2>\n<h3>Exporting to YAML<\/h3>\n<p>YAML is a great way to show your tests to colleagues who do not know Python. You can save all the checks to a YAML file with the schema.to_yaml() method:<\/p>\n<pre><code class=\"python\">from pathlib import Path\n\n# Get the schema as a YAML string\nyaml_schema = schema.to_yaml()\n\n# Save to a file\nf = Path(\"schema.yml\")\nf.touch()\nf.write_text(yaml_schema)<\/code><\/pre>\n<p>The schema.yml file should look something like this:<\/p>\n<pre><code class=\"yaml\">schema_type: dataframe\nversion: 0.7.0\ncolumns:\n  name:\n    dtype: str\n    nullable: false\n    checks:\n      isin:\n      - apple\n      - banana\n      - orange\n    allow_duplicates: true\n    coerce: false\n    required: true\n    regex: false\n  store:\n    dtype: str\n    nullable: true\n    checks:\n      isin:\n      - Aldi\n      - Walmart\n    allow_duplicates: false\n    coerce: false\n    required: true\n    regex: false\n  price:\n    dtype: int64
<\/code><\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[],"tags":[],"class_list":["post-331341","post","type-post","status-publish","format-standard","hentry"],"_links":{"self":[{"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=\/wp\/v2\/posts\/331341","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=331341"}],"version-history":[{"count":0,"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=\/wp\/v2\/posts\/331341\/revisions"}],"wp:attachment":[{"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=331341"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=331341"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=331341"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}