{"id":452827,"date":"2025-03-23T15:04:54","date_gmt":"2025-03-23T15:04:54","guid":{"rendered":"http:\/\/savepearlharbor.com\/?p=452827"},"modified":"-0001-11-30T00:00:00","modified_gmt":"-0001-11-29T21:00:00","slug":"","status":"publish","type":"post","link":"https:\/\/savepearlharbor.com\/?p=452827","title":{"rendered":"<span>Mastering Data Lifecycle Management: ILM in Postgres Pro Enterprise 17<\/span>"},"content":{"rendered":"<div><!--[--><!--]--><\/div>\n<div id=\"post-content-body\">\n<div>\n<div class=\"article-formatted-body article-formatted-body article-formatted-body_version-2\">\n<div xmlns=\"http:\/\/www.w3.org\/1999\/xhtml\">\n<figure class=\"full-width\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w780q1\/getpro\/habr\/upload_files\/149\/3dd\/31c\/1493dd31c1a669a173d53b4cd5a19ac4.jpg\" width=\"3250\" height=\"1667\" data-src=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/149\/3dd\/31c\/1493dd31c1a669a173d53b4cd5a19ac4.jpg\" data-blurred=\"true\"\/><\/figure>\n<h3>The Growing Pains of Large Databases<\/h3>\n<p>Data is one of the most valuable assets of any company, but as businesses grow, so does their data footprint. Initially, a small, easy-to-manage database may suffice, but as the data scales up, companies typically face several pain points:<\/p>\n<ul>\n<li>\n<p>Storage costs increase as access frequency decreases<\/p>\n<\/li>\n<li>\n<p>High-performance storage is expensive and limited<\/p>\n<\/li>\n<li>\n<p>Older data sees less usage over time<\/p>\n<\/li>\n<li>\n<p>Most data operations occur within a short window (weeks or months)<\/p>\n<\/li>\n<li>\n<p>Historical data accumulates, consuming valuable high-speed storage<\/p>\n<\/li>\n<\/ul>\n<figure class=\"full-width\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/e86\/51c\/470\/e8651c470f0e347b02b7338655a6e6fe.png\" width=\"5017\" height=\"1813\" data-src=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/e86\/51c\/470\/e8651c470f0e347b02b7338655a6e6fe.png\"\/><\/figure>\n<p>A one-size-fits-all storage approach is inefficient. Companies have to balance speed, cost, and data volume, while automating this process at scale \u2014 without manual intervention.<\/p>\n<h4>A Simple Analogy<\/h4>\n<p>Think of your database like a wardrobe. At first, you have plenty of space, but over time, old and rarely used items pile up. Expanding your closet is an option, but a smarter approach is moving seasonal clothes to storage \u2014 your garage, attic, or a country house. ILM does the same for your data.<\/p>\n<h3>What is Information Lifecycle Management (ILM)?<\/h3>\n<p>ILM is a data management strategy that categorizes information based on its business value and automates storage decisions according to company policies. It answers critical questions:<\/p>\n<ul>\n<li>\n<p>Which data needs instant access?<\/p>\n<\/li>\n<li>\n<p>Which data can be moved to slower, cheaper storage?<\/p>\n<\/li>\n<li>\n<p>Which data can be archived or deleted?<\/p>\n<\/li>\n<\/ul>\n<h4>Example: ILM in an E-Commerce Store<\/h4>\n<p>An online store processes thousands of transactions daily:<\/p>\n<ul>\n<li>\n<p>Active orders &amp; recent purchases are stored on fast NVMe drives for quick access<\/p>\n<\/li>\n<li>\n<p>Older orders from years ago are moved to low-cost HDD storage for analytics<\/p>\n<\/li>\n<li>\n<p>Expired data (e.g., outdated logs) are archived or deleted per company policies<\/p>\n<\/li>\n<\/ul>\n<p>By classifying data into four lifecycle stages, ILM ensures optimal performance and cost-efficiency:<\/p>\n<ol>\n<li>\n<p>Active Data \u2192 Frequently updated &amp; randomly accessed (e.g., current transactions)<\/p>\n<\/li>\n<li>\n<p>Less Active Data \u2192 Rarely modified, column-scanned (e.g., quarterly transaction history)<\/p>\n<\/li>\n<li>\n<p>Historical Data \u2192 Never changes, rarely accessed (e.g., multi-year financial records)<\/p>\n<\/li>\n<li>\n<p>Archived Data \u2192 Retained only for compliance<\/p>\n<\/li>\n<\/ol>\n<p>Each stage requires a different storage tier:<\/p>\n<ol>\n<li>\n<p>NVMe SSDs for high-speed, frequently accessed data<\/p>\n<\/li>\n<li>\n<p>Standard SSDs for moderately active data<\/p>\n<\/li>\n<li>\n<p>HDDs (compressed storage) for historical data<\/p>\n<\/li>\n<li>\n<p>External storage for long-term archives<\/p>\n<\/li>\n<\/ol>\n<figure class=\"full-width\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/b4b\/1d3\/908\/b4b1d3908efd50c259d815d76d17976b.png\" width=\"1560\" height=\"913\" data-src=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/b4b\/1d3\/908\/b4b1d3908efd50c259d815d76d17976b.png\"\/><\/figure>\n<p>Note: The current pgpro_ilm extension does not yet support external storage like tape drives.<\/p>\n<h3>How ILM Works in Postgres Pro Enterprise 17<\/h3>\n<p>ILM automation in pgpro_ilm is based on three key components:<\/p>\n<ul>\n<li>\n<p>Object \u2014 the target data (a table or partition)<\/p>\n<\/li>\n<li>\n<p>Condition \u2014 the trigger (e.g., data hasn&#8217;t been accessed in 6 months)<\/p>\n<\/li>\n<li>\n<p>Action \u2014 the operation (e.g., move data to a different storage tier)<\/p>\n<\/li>\n<\/ul>\n<figure class=\"full-width\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/8d5\/b28\/8ff\/8d5b288ff2786e1780304ec676fcf8f6.png\" width=\"4013\" height=\"2421\" data-src=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/8d5\/b28\/8ff\/8d5b288ff2786e1780304ec676fcf8f6.png\"\/><\/figure>\n<p>With these components, you can set up rules tailored to your company&#8217;s ILM strategy. For example:<\/p>\n<figure class=\"full-width\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/399\/2f3\/28a\/3992f328a456b9e2a5c567216b4a773a.png\" width=\"4788\" height=\"2421\" data-src=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/399\/2f3\/28a\/3992f328a456b9e2a5c567216b4a773a.png\"\/><\/figure>\n<p>First, less active data is moved to a tablespace on more affordable disks.<\/p>\n<p>Then, once the data becomes purely historical, it gets transferred to a tablespace on high-capacity, cost-effective storage.<\/p>\n<p><strong>Note:<\/strong> Rules are processed in descending order of time period. This is intentional \u2014 starting with the &#171;oldest&#187; rule makes sense because if it applies, there&#8217;s no need to check the others.<\/p>\n<h3>Processing Conditions<\/h3>\n<p>To use ILM effectively, you need data access statistics \u2014 without them, there&#8217;s no way to tell which data is &#171;new&#187; and which is &#171;old.&#187; However, Postgres Pro didn\u2019t have such statistics, so we started exploring ways to collect the necessary data. Our solution had to meet a few key requirements:<\/p>\n<ul>\n<li>\n<p>It shouldn\u2019t track system users (like postgres) or system processes (such as vacuum).<\/p>\n<\/li>\n<li>\n<p>It should allow additional exclusions for specific users\u2014useful if certain actions need to be ignored when generating statistics.<\/p>\n<\/li>\n<li>\n<p>The statistics must be persistent. They shouldn\u2019t disappear after a server restart, meaning they must be stored on disk, not just in memory.<\/p>\n<\/li>\n<\/ul>\n<p>While working on collecting DML operation statistics, we were also tackling the issue of unused privilege detection within the pgpro_usage extension. Eventually, we merged both efforts, integrating all required statistics collection into pgpro_usage. Now, everything is gathered together but remains separate \u2014 resetting statistics for unused privilege tracking doesn\u2019t affect ILM-related statistics.<\/p>\n<p><em>Note: <\/em>Think of it like a car&#8217;s odometer. It tracks both the total mileage and trip mileage separately. Resetting the trip counter doesn\u2019t erase the overall mileage.<\/p>\n<p>The pgpro_ilm extension retrieves last-access timestamps for tables, categorized by users, using functions and views from pgpro_usage. For ILM purposes, only specific actions matter, depending on the chosen rule:<\/p>\n<ul>\n<li>\n<p><strong>NO_MODIFICATION<\/strong>: Tracks UPDATE, INSERT, DELETE, and TRUNCATE.<\/p>\n<\/li>\n<li>\n<p><strong>NO_ACCESS<\/strong>: Includes the above actions plus SELECT operations.<\/p>\n<\/li>\n<\/ul>\n<figure class=\"full-width\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/3e7\/8f3\/5f3\/3e78f35f33aef8e5e4eeb5c7759f24e2.png\" width=\"1560\" height=\"578\" data-src=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/3e7\/8f3\/5f3\/3e78f35f33aef8e5e4eeb5c7759f24e2.png\"\/><\/figure>\n<p>With this data, you can determine when, who, and how database objects were accessed.<\/p>\n<h3>From Theory to Practice<\/h3>\n<p>Now that we\u2019ve covered how ILM works, let\u2019s get hands-on with the pgpro_ilm extension in Postgres Pro Enterprise 17.<\/p>\n<h3>Installation<\/h3>\n<p>Before setting up pgpro_ilm, you need to install pgpro_usage. Add it to shared_preload_libraries in postgresql.conf and restart the database server:<\/p>\n<pre><code class=\"sql\">CREATE EXTENSION pgpro_usage; CREATE EXTENSION pgpro_ilm;<\/code><\/pre>\n<p>That\u2019s it \u2014 ILM is ready to go.<\/p>\n<h3>Managing User Exclusions<\/h3>\n<p>You can exclude certain users from ILM tracking to avoid skewed statistics. This is useful for:<\/p>\n<ul>\n<li>\n<p>System users performing maintenance tasks.<\/p>\n<\/li>\n<li>\n<p>Internal\/external auditors who only access data occasionally.<\/p>\n<\/li>\n<li>\n<p>Users running rare reports (e.g., a one-time historical data request).<\/p>\n<\/li>\n<li>\n<p>Batch operations involving old data.<\/p>\n<\/li>\n<\/ul>\n<p>To return the list of excluded users:<\/p>\n<pre><code class=\"sql\">SELECT user_name, exclude_access, exclude_modification FROM pgpro_ilm.get_exclude_users();<\/code><\/pre>\n<p>To set the list of users whose actions are ignored when checking NO_ACCESS and NO_MODIFICATION rules.<\/p>\n<pre><code class=\"sql\">SELECT pgpro_ilm.set_exclude_users(array['exclude_access_user'], array['exclude_write_user']);<\/code><\/pre>\n<h3>Defining Rules<\/h3>\n<p>You can set up ILM rules using the add_rule function:<\/p>\n<pre><code class=\"sql\">add_rule(iv_object_name text, iv_rule_type text, iv_period interval, iv_action text, iv_parameter text) returns void.<\/code><\/pre>\n<p>Each rule has:<\/p>\n<ul>\n<li>\n<p><code>iv_object_name<\/code>: Target table.<\/p>\n<\/li>\n<li>\n<p><code>iv_rule_type<\/code>: Rule type (NO_ACCESS or NO_MODIFICATION).<\/p>\n<\/li>\n<li>\n<p><code>iv_period<\/code>: Time before the rule triggers.<\/p>\n<\/li>\n<li>\n<p><code>iv_action<\/code>: Action to take (e.g., move data).<\/p>\n<\/li>\n<li>\n<p><code>iv_parameter<\/code>: Action parameter (e.g., target tablespace).<\/p>\n<\/li>\n<\/ul>\n<p>Here is an example:<\/p>\n<pre><code class=\"sql\">SELECT pgpro_ilm.add_rule('sales_table_section_q1_2021', 'NO_ACCESS', interval '12 mons', 'ALTER_TS', 'low_cost_sales_tablespace');<\/code><\/pre>\n<p>Example policies:<\/p>\n<ul>\n<li>\n<p>Data untouched for 3+ months \u2192 Move to SSD.<\/p>\n<\/li>\n<li>\n<p>Data not queried for 6+ months \u2192 Move to HDD.<\/p>\n<\/li>\n<li>\n<p>Data unread for 12+ months \u2192 Compress.<\/p>\n<\/li>\n<\/ul>\n<figure class=\"full-width\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/f58\/922\/63a\/f5892263ab76bec7dce2d8f1bae91e11.png\" width=\"2359\" height=\"2042\" data-src=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/f58\/922\/63a\/f5892263ab76bec7dce2d8f1bae91e11.png\"\/><\/figure>\n<p>To remove a rule:<\/p>\n<pre><code class=\"sql\">remove_rule(iv_object_name text, iv_rule_type text, iv_period interval, iv_action text) returns void<\/code><\/pre>\n<p>Each rule has the same parameters as above excluding \u00a0iv_parameter.<\/p>\n<p>To list existing rules:<\/p>\n<pre><code class=\"sql\">get_rules([iv_object_name text]) returns table<\/code><\/pre>\n<p>Each rule has:<\/p>\n<p><code>iv_object_name<\/code> \u2014 target table.<\/p>\n<h3>Processing Rules<\/h3>\n<p>Rules can be applied manually or automated via job scheduling (recommended). Postgres Pro provides two functions for rule processing:<\/p>\n<ul>\n<li>\n<p>process_rules(iv_object_name text) RETURNS void \u2014 applies rules to a specific table, where iv_object_name is the table name.<\/p>\n<\/li>\n<li>\n<p>process_all_rules() RETURNS void \u2014 applies all rules to all tables.<\/p>\n<\/li>\n<\/ul>\n<p>For example, to process rules for a specific table section:<\/p>\n<pre><code class=\"sql\">SELECT pgpro_ilm.process_rules('sales_table_section_q1_2021');<\/code><\/pre>\n<p><em>Note: <\/em>If a table is partitioned, parent partitions are processed first, starting from the lowest level. Rules are applied in reverse order of their periods, and execution stops once a match is found.<\/p>\n<h4>Key Points About ILM<\/h4>\n<ul>\n<li>\n<p>Rules can be set for an entire partitioned table or individual partitions. If applied to a partitioned table, they are automatically inherited by all partitions.<\/p>\n<\/li>\n<li>\n<p>Partition-specific rules take priority over table-wide rules, but parent table rules still apply.<\/p>\n<\/li>\n<\/ul>\n<figure class=\"full-width\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/3ac\/fab\/9ad\/3acfab9ad31bfeda2ea6d13ded6c0df8.png\" width=\"1560\" height=\"530\" data-src=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/3ac\/fab\/9ad\/3acfab9ad31bfeda2ea6d13ded6c0df8.png\"\/><\/figure>\n<ul>\n<li>\n<p>Running process_rules on a partitioned table does nothing since data is stored in partitions, not the table itself.<\/p>\n<\/li>\n<li>\n<p>Currently, indexes move with the target table\/partition. Future releases will allow separate rules for indexes.<\/p>\n<\/li>\n<\/ul>\n<figure class=\"full-width\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/8ea\/dd6\/aa9\/8eadd6aa934de79766ac2f38a475280c.png\" width=\"1560\" height=\"277\" data-src=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/8ea\/dd6\/aa9\/8eadd6aa934de79766ac2f38a475280c.png\"\/><\/figure>\n<ul>\n<li>\n<p>If a table has already been moved, the rule won\u2019t be applied again.<\/p>\n<\/li>\n<li>\n<p>ALTER TABLE \u2026 SET TABLESPACE \u2026 uses AccessExclusiveLock, so calling process_rules again while a previous operation is still running is safe.<\/p>\n<\/li>\n<li>\n<p>If ALTER TABLE fails, re-running process_rules will retry the operation.<\/p>\n<\/li>\n<\/ul>\n<h3>Additional ILM Tools in Postgres Pro Enterprise 17<\/h3>\n<p>For even greater control over large datasets, Postgres Pro Enterprise 17 offers:<\/p>\n<ul>\n<li>\n<p>SPLIT PARTITION \/ MERGE PARTITIONS \u2192 Manage partitions dynamically<\/p>\n<\/li>\n<li>\n<p>CFS (Compression Feature Set) \u2192 Reduce storage costs by compressing tables &amp; indexes<\/p>\n<\/li>\n<li>\n<p>pgpro_autopart \u2192 Automatic partitioning based on data volume<\/p>\n<\/li>\n<li>\n<p>pgpro_bfile \u2192 Store unstructured data outside the database<\/p>\n<\/li>\n<li>\n<p>pgpro_scheduler \u2192 Automate ILM tasks with scheduled jobs<\/p>\n<\/li>\n<li>\n<p>Shardman \u2192 Create scalable, distributed PostgreSQL databases<\/p>\n<\/li>\n<\/ul>\n<h3>Final Thoughts<\/h3>\n<p>With Postgres Pro Enterprise 17, managing data at scale has never been easier. The pgpro_ilm extension gives you full control over data lifecycle automation, ensuring:<\/p>\n<ul>\n<li>\n<p>Faster databases by keeping only the most relevant data in high-performance storage<\/p>\n<\/li>\n<li>\n<p>Lower costs by offloading historical data to cheaper storage tiers<\/p>\n<\/li>\n<li>\n<p>Simplified operations with fully automated ILM policies<\/p>\n<\/li>\n<\/ul>\n<p>And the best part: pgpro_ilm is built-in\u2014no extra licensing, no complex setup. Just pure Postgres-powered efficiency<\/p>\n<\/p>\n<\/div>\n<\/div>\n<\/div>\n<p><!----><!----><\/div>\n<p><!----><!----><br \/> \u0441\u0441\u044b\u043b\u043a\u0430 \u043d\u0430 \u043e\u0440\u0438\u0433\u0438\u043d\u0430\u043b \u0441\u0442\u0430\u0442\u044c\u0438 <a href=\"https:\/\/habr.com\/ru\/articles\/891878\/\"> https:\/\/habr.com\/ru\/articles\/891878\/<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<div><!--[--><!--]--><\/div>\n<div id=\"post-content-body\">\n<div>\n<div class=\"article-formatted-body article-formatted-body article-formatted-body_version-2\">\n<div xmlns=\"http:\/\/www.w3.org\/1999\/xhtml\">\n<figure class=\"full-width\"><\/figure>\n<h3>The Growing Pains of Large Databases<\/h3>\n<p>Data is one of the most valuable assets of any company, but as businesses grow, so does their data footprint. Initially, a small, easy-to-manage database may suffice, but as the data scales up, companies typically face several pain points:<\/p>\n<ul>\n<li>\n<p>Storage costs increase as access frequency decreases<\/p>\n<\/li>\n<li>\n<p>High-performance storage is expensive and limited<\/p>\n<\/li>\n<li>\n<p>Older data sees less usage over time<\/p>\n<\/li>\n<li>\n<p>Most data operations occur within a short window (weeks or months)<\/p>\n<\/li>\n<li>\n<p>Historical data accumulates, consuming valuable high-speed storage<\/p>\n<\/li>\n<\/ul>\n<figure class=\"full-width\"><\/figure>\n<p>A one-size-fits-all storage approach is inefficient. Companies have to balance speed, cost, and data volume, while automating this process at scale \u2014 without manual intervention.<\/p>\n<h4>A Simple Analogy<\/h4>\n<p>Think of your database like a wardrobe. At first, you have plenty of space, but over time, old and rarely used items pile up. Expanding your closet is an option, but a smarter approach is moving seasonal clothes to storage \u2014 your garage, attic, or a country house. ILM does the same for your data.<\/p>\n<h3>What is Information Lifecycle Management (ILM)?<\/h3>\n<p>ILM is a data management strategy that categorizes information based on its business value and automates storage decisions according to company policies. It answers critical questions:<\/p>\n<ul>\n<li>\n<p>Which data needs instant access?<\/p>\n<\/li>\n<li>\n<p>Which data can be moved to slower, cheaper storage?<\/p>\n<\/li>\n<li>\n<p>Which data can be archived or deleted?<\/p>\n<\/li>\n<\/ul>\n<h4>Example: ILM in an E-Commerce Store<\/h4>\n<p>An online store processes thousands of transactions daily:<\/p>\n<ul>\n<li>\n<p>Active orders &amp; recent purchases are stored on fast NVMe drives for quick access<\/p>\n<\/li>\n<li>\n<p>Older orders from years ago are moved to low-cost HDD storage for analytics<\/p>\n<\/li>\n<li>\n<p>Expired data (e.g., outdated logs) are archived or deleted per company policies<\/p>\n<\/li>\n<\/ul>\n<p>By classifying data into four lifecycle stages, ILM ensures optimal performance and cost-efficiency:<\/p>\n<ol>\n<li>\n<p>Active Data \u2192 Frequently updated &amp; randomly accessed (e.g., current transactions)<\/p>\n<\/li>\n<li>\n<p>Less Active Data \u2192 Rarely modified, column-scanned (e.g., quarterly transaction history)<\/p>\n<\/li>\n<li>\n<p>Historical Data \u2192 Never changes, rarely accessed (e.g., multi-year financial records)<\/p>\n<\/li>\n<li>\n<p>Archived Data \u2192 Retained only for compliance<\/p>\n<\/li>\n<\/ol>\n<p>Each stage requires a different storage tier:<\/p>\n<ol>\n<li>\n<p>NVMe SSDs for high-speed, frequently accessed data<\/p>\n<\/li>\n<li>\n<p>Standard SSDs for moderately active data<\/p>\n<\/li>\n<li>\n<p>HDDs (compressed storage) for historical data<\/p>\n<\/li>\n<li>\n<p>External storage for long-term archives<\/p>\n<\/li>\n<\/ol>\n<figure class=\"full-width\"><\/figure>\n<p>Note: The current pgpro_ilm extension does not yet support external storage like tape drives.<\/p>\n<h3>How ILM Works in Postgres Pro Enterprise 17<\/h3>\n<p>ILM automation in pgpro_ilm is based on three key components:<\/p>\n<ul>\n<li>\n<p>Object \u2014 the target data (a table or partition)<\/p>\n<\/li>\n<li>\n<p>Condition \u2014 the trigger (e.g., data hasn&#8217;t been accessed in 6 months)<\/p>\n<\/li>\n<li>\n<p>Action \u2014 the operation (e.g., move data to a different storage tier)<\/p>\n<\/li>\n<\/ul>\n<figure class=\"full-width\"><\/figure>\n<p>With these components, you can set up rules tailored to your company&#8217;s ILM strategy. For example:<\/p>\n<figure class=\"full-width\"><\/figure>\n<p>First, less active data is moved to a tablespace on more affordable disks.<\/p>\n<p>Then, once the data becomes purely historical, it gets transferred to a tablespace on high-capacity, cost-effective storage.<\/p>\n<p><strong>Note:<\/strong> Rules are processed in descending order of time period. This is intentional \u2014 starting with the &#171;oldest&#187; rule makes sense because if it applies, there&#8217;s no need to check the others.<\/p>\n<h3>Processing Conditions<\/h3>\n<p>To use ILM effectively, you need data access statistics \u2014 without them, there&#8217;s no way to tell which data is &#171;new&#187; and which is &#171;old.&#187; However, Postgres Pro didn\u2019t have such statistics, so we started exploring ways to collect the necessary data. Our solution had to meet a few key requirements:<\/p>\n<ul>\n<li>\n<p>It shouldn\u2019t track system users (like postgres) or system processes (such as vacuum).<\/p>\n<\/li>\n<li>\n<p>It should allow additional exclusions for specific users\u2014useful if certain actions need to be ignored when generating statistics.<\/p>\n<\/li>\n<li>\n<p>The statistics must be persistent. They shouldn\u2019t disappear after a server restart, meaning they must be stored on disk, not just in memory.<\/p>\n<\/li>\n<\/ul>\n<p>While working on collecting DML operation statistics, we were also tackling the issue of unused privilege detection within the pgpro_usage extension. Eventually, we merged both efforts, integrating all required statistics collection into pgpro_usage. Now, everything is gathered together but remains separate \u2014 resetting statistics for unused privilege tracking doesn\u2019t affect ILM-related statistics.<\/p>\n<p><em>Note: <\/em>Think of it like a car&#8217;s odometer. It tracks both the total mileage and trip mileage separately. Resetting the trip counter doesn\u2019t erase the overall mileage.<\/p>\n<p>The pgpro_ilm extension retrieves last-access timestamps for tables, categorized by users, using functions and views from pgpro_usage. For ILM purposes, only specific actions matter, depending on the chosen rule:<\/p>\n<ul>\n<li>\n<p><strong>NO_MODIFICATION<\/strong>: Tracks UPDATE, INSERT, DELETE, and TRUNCATE.<\/p>\n<\/li>\n<li>\n<p><strong>NO_ACCESS<\/strong>: Includes the above actions plus SELECT operations.<\/p>\n<\/li>\n<\/ul>\n<figure class=\"full-width\"><\/figure>\n<p>With this data, you can determine when, who, and how database objects were accessed.<\/p>\n<h3>From Theory to Practice<\/h3>\n<p>Now that we\u2019ve covered how ILM works, let\u2019s get hands-on with the pgpro_ilm extension in Postgres Pro Enterprise 17.<\/p>\n<h3>Installation<\/h3>\n<p>Before setting up pgpro_ilm, you need to install pgpro_usage. Add it to shared_preload_libraries in postgresql.conf and restart the database server:<\/p>\n<pre><code class=\"sql\">CREATE EXTENSION pgpro_usage; CREATE EXTENSION pgpro_ilm;<\/code><\/pre>\n<p>That\u2019s it \u2014 ILM is ready to go.<\/p>\n<h3>Managing User Exclusions<\/h3>\n<p>You can exclude certain users from ILM tracking to avoid skewed statistics. This is useful for:<\/p>\n<ul>\n<li>\n<p>System users performing maintenance tasks.<\/p>\n<\/li>\n<li>\n<p>Internal\/external auditors who only access data occasionally.<\/p>\n<\/li>\n<li>\n<p>Users running rare reports (e.g., a one-time historical data request).<\/p>\n<\/li>\n<li>\n<p>Batch operations involving old data.<\/p>\n<\/li>\n<\/ul>\n<p>To return the list of excluded users:<\/p>\n<pre><code class=\"sql\">SELECT user_name, exclude_access, exclude_modification FROM pgpro_ilm.get_exclude_users();<\/code><\/pre>\n<p>To set the list of users whose actions are ignored when checking NO_ACCESS and NO_MODIFICATION rules.<\/p>\n<pre><code class=\"sql\">SELECT pgpro_ilm.set_exclude_users(array['exclude_access_user'], array['exclude_write_user']);<\/code><\/pre>\n<h3>Defining Rules<\/h3>\n<p>You can set up ILM rules using the add_rule function:<\/p>\n<pre><code class=\"sql\">add_rule(iv_object_name text, iv_rule_type text, iv_period interval, iv_action text, iv_parameter text) returns void.<\/code><\/pre>\n<p>Each rule has:<\/p>\n<ul>\n<li>\n<p><code>iv_object_name<\/code>: Target table.<\/p>\n<\/li>\n<li>\n<p><code>iv_rule_type<\/code>: Rule type (NO_ACCESS or NO_MODIFICATION).<\/p>\n<\/li>\n<li>\n<p><code>iv_period<\/code>: Time before the rule triggers.<\/p>\n<\/li>\n<li>\n<p><code>iv_action<\/code>: Action to take (e.g., move data).<\/p>\n<\/li>\n<li>\n<p><code>iv_parameter<\/code>: Action parameter (e.g., target tablespace).<\/p>\n<\/li>\n<\/ul>\n<p>Here is an example:<\/p>\n<pre><code class=\"sql\">SELECT pgpro_ilm.add_rule('sales_table_section_q1_2021', 'NO_ACCESS', interval '12 mons', 'ALTER_TS', 'low_cost_sales_tablespace');<\/code><\/pre>\n<p>Example policies:<\/p>\n<ul>\n<li>\n<p>Data untouched for 3+ months \u2192 Move to SSD.<\/p>\n<\/li>\n<li>\n<p>Data not queried for 6+ months \u2192 Move to HDD.<\/p>\n<\/li>\n<li>\n<p>Data unread for 12+ months \u2192 Compress.<\/p>\n<\/li>\n<\/ul>\n<figure class=\"full-width\"><\/figure>\n<p>To remove a rule:<\/p>\n<pre><code class=\"sql\">remove_rule(iv_object_name text, iv_rule_type text, iv_period interval, iv_action text) returns void<\/code><\/pre>\n<p>Each rule has the same parameters as above excluding \u00a0iv_parameter.<\/p>\n<p>To list existing rules:<\/p>\n<pre><code class=\"sql\">get_rules([iv_object_name text]) returns table<\/code><\/pre>\n<p>Each rule has:<\/p>\n<p><code>iv_object_name<\/code> \u2014 target table.<\/p>\n<h3>Processing Rules<\/h3>\n<p>Rules can be applied manually or automated via job scheduling (recommended). Postgres Pro provides two functions for rule processing:<\/p>\n<ul>\n<li>\n<p>process_rules(iv_object_name text) RETURNS void \u2014 applies rules to a specific table, where iv_object_name is the table name.<\/p>\n<\/li>\n<li>\n<p>process_all_rules() RETURNS void \u2014 applies all rules to all tables.<\/p>\n<\/li>\n<\/ul>\n<p>For example, to process rules for a specific table section:<\/p>\n<pre><code class=\"sql\">SELECT pgpro_ilm.process_rules('sales_table_section_q1_2021');<\/code><\/pre>\n<p><em>Note: <\/em>If a table is partitioned, parent partitions are processed first, starting from the lowest level. Rules are applied in reverse order of their periods, and execution stops once a match is found.<\/p>\n<h4>Key Points About ILM<\/h4>\n<ul>\n<li>\n<p>Rules can be set for an entire partitioned table or individual partitions. If applied to a partitioned table, they are automatically inherited by all partitions.<\/p>\n<\/li>\n<li>\n<p>Partition-specific rules take priority over table-wide rules, but parent table rules still apply.<\/p>\n<\/li>\n<\/ul>\n<figure class=\"full-width\"><\/figure>\n<ul>\n<li>\n<p>Running process_rules on a partitioned table does nothing since data is stored in partitions, not the table itself.<\/p>\n<\/li>\n<li>\n<p>Currently, indexes move with the target table\/partition. Future releases will allow separate rules for indexes.<\/p>\n<\/li>\n<\/ul>\n<figure class=\"full-width\"><\/figure>\n<ul>\n<li>\n<p>If a table has already been moved, the rule won\u2019t be applied again.<\/p>\n<\/li>\n<li>\n<p>ALTER TABLE \u2026 SET TABLESPACE \u2026 uses AccessExclusiveLock, so calling process_rules again while a previous operation is still running is safe.<\/p>\n<\/li>\n<li>\n<p>If ALTER TABLE fails, re-running process_rules will retry the operation.<\/p>\n<\/li>\n<\/ul>\n<h3>Additional ILM Tools in Postgres Pro Enterprise 17<\/h3>\n<p>For even greater control over large datasets, Postgres Pro Enterprise 17 offers:<\/p>\n<ul>\n<li>\n<p>SPLIT PARTITION \/ MERGE PARTITIONS \u2192 Manage partitions dynamically<\/p>\n<\/li>\n<li>\n<p>CFS (Compression Feature Set) \u2192 Reduce storage costs by compressing tables &amp; indexes<\/p>\n<\/li>\n<li>\n<p>pgpro_autopart \u2192 Automatic partitioning based on data volume<\/p>\n<\/li>\n<li>\n<p>pgpro_bfile \u2192 Store unstructured data outside the database<\/p>\n<\/li>\n<li>\n<p>pgpro_scheduler \u2192 Automate ILM tasks with scheduled jobs<\/p>\n<\/li>\n<li>\n<p>Shardman \u2192 Create scalable, distributed PostgreSQL databases<\/p>\n<\/li>\n<\/ul>\n<h3>Final Thoughts<\/h3>\n<p>With Postgres Pro Enterprise 17, managing data at scale has never been easier. The pgpro_ilm extension gives you full control over data lifecycle automation, ensuring:<\/p>\n<ul>\n<li>\n<p>Faster databases by keeping only the most relevant data in high-performance storage<\/p>\n<\/li>\n<li>\n<p>Lower costs by offloading historical data to cheaper storage tiers<\/p>\n<\/li>\n<li>\n<p>Simplified operations with fully automated ILM policies<\/p>\n<\/li>\n<\/ul>\n<p>And the best part: pgpro_ilm is built-in\u2014no extra licensing, no complex setup. Just pure Postgres-powered efficiency<\/p>\n<\/p>\n<\/div>\n<\/div>\n<\/div>\n<p><!----><!----><\/div>\n<p><!----><!----><br \/> \u0441\u0441\u044b\u043b\u043a\u0430 \u043d\u0430 \u043e\u0440\u0438\u0433\u0438\u043d\u0430\u043b \u0441\u0442\u0430\u0442\u044c\u0438 <a href=\"https:\/\/habr.com\/ru\/articles\/891878\/\"> https:\/\/habr.com\/ru\/articles\/891878\/<\/a><br \/><\/br><\/br><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[],"tags":[],"class_list":["post-452827","post","type-post","status-publish","format-standard","hentry"],"_links":{"self":[{"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=\/wp\/v2\/posts\/452827","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=452827"}],"version-history":[{"count":0,"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=\/wp\/v2\/posts\/452827\/revisions"}],"wp:attachment":[{"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=452827"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=452827"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=452827"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}