{"id":473229,"date":"2025-09-02T15:52:58","date_gmt":"2025-09-02T15:52:58","guid":{"rendered":"http:\/\/savepearlharbor.com\/?p=473229"},"modified":"-0001-11-30T00:00:00","modified_gmt":"-0001-11-29T21:00:00","slug":"","status":"publish","type":"post","link":"https:\/\/savepearlharbor.com\/?p=473229","title":{"rendered":"<span>Intelligent systems at phystech: 2025 graduation<\/span>"},"content":{"rendered":"<div><!--[--><!--]--><\/div>\n<div id=\"post-content-body\">\n<div>\n<div class=\"article-formatted-body article-formatted-body article-formatted-body_version-2\">\n<div xmlns=\"http:\/\/www.w3.org\/1999\/xhtml\">\n<figure class=\"full-width\"><img decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/5d1\/fb2\/090\/5d1fb209048ee8f1a17a623974e503ca.png\" width=\"1024\" height=\"1024\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/r\/w780\/getpro\/habr\/upload_files\/5d1\/fb2\/090\/5d1fb209048ee8f1a17a623974e503ca.png 780w,&#10;       https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/5d1\/fb2\/090\/5d1fb209048ee8f1a17a623974e503ca.png 781w\" loading=\"lazy\" decode=\"async\"\/><\/figure>\n<p>The students of the Intelligent Systems Department successfully defended their bachelor\u2019s and master\u2019s theses. This year, 14 Bachelor\u2019s and 8 Master\u2019s students earned their degrees in Physics, Mathematics, and Computer Sciences. We are proud to say that our Department is unique in publishing the complete set of defense materials during the last ten years. These materials include the text of the <a href=\"https:\/\/intsystems.github.io\/materials\/thesis\/\" rel=\"noopener noreferrer nofollow\">dissertation work<\/a>, the published papers, the code of the computational experiments, and the slides with video of the defense talk.<\/p>\n<p>We encourage our students to publish the results of their scientific research in peer-reviewed journals. 
<a href=\"https:\/\/habr.com\/ru\/articles\/871802\/\" rel=\"noopener noreferrer nofollow\">In 2024, our students published 53 papers<\/a>. An exemplary thesis takes the form of a formatted scientific paper for a BS student and of several papers for an MS student. This ensures that the results of the student&#8217;s research are critically reviewed and approved by the scientific community.<\/p>\n<p>In 2025, AI assistance became a challenge for thesis defenses. The problem is that AI assistants intrude on the text of student work. We omit many unnecessary formalities of text preparation, such as a required number of pages or a list of performance criteria. Instead, we review the focus of the work: the student&#8217;s personal results should advance the theory of machine learning. In this light, AI assistance expands the students&#8217; experience without invading their texts.<\/p>\n<p>In this post, we gladly summarize the defended works of our BS and MS students and highlight the results. Recordings of their pre-defense presentations can be found <a href=\"https:\/\/www.youtube.com\/watch?v=TADwppxHtIU\" rel=\"noopener noreferrer nofollow\">here<\/a> and <a href=\"https:\/\/www.youtube.com\/watch?v=M6TfX_ZHhTA\" rel=\"noopener noreferrer nofollow\">here<\/a> (in Russian). Most of the theses have a publicly available English version.<\/p>\n<p>We encourage our students to devote the main part of their efforts to developing the theory of machine learning. 
However, optimization and applied data science research are also on the agenda.<\/p>\n<h2>Applied methods in machine learning<\/h2>\n<p>Research in applied machine learning methods is popular among our students, as it has the most immediate impact on our lives.<\/p>\n<p><strong>Galina Boeva<\/strong>\u2019s master&#8217;s <a href=\"https:\/\/github.com\/intsystems\/Boeva-MS-Thesis\/blob\/master\/paper\/thesis_master_2025.pdf\" rel=\"noopener noreferrer nofollow\">thesis<\/a>, supervised by our alumnus <strong>Dr. Alexey Zaytsev<\/strong>, introduces LANET, a model for predicting timestamp sets using attention and historical data aggregation. Its key innovation is modeling label relationships, supported by theoretical analysis and attention graph visualizations. The work resulted in a publication at the ECAI <a href=\"https:\/\/ebooks.iospress.nl\/volumearticle\/70171\" rel=\"noopener noreferrer nofollow\">conference<\/a>.<\/p>\n<figure class=\"full-width\"><img decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/01a\/919\/682\/01a919682147d6691a987b5c9a8a3fe8.png\" alt=\"From the pre-print version of the paper: architecture of LANET.\" title=\"From the pre-print version of the paper: architecture of LANET.\" width=\"1092\" height=\"365\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/r\/w780\/getpro\/habr\/upload_files\/01a\/919\/682\/01a919682147d6691a987b5c9a8a3fe8.png 780w,&#10;       https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/01a\/919\/682\/01a919682147d6691a987b5c9a8a3fe8.png 781w\" loading=\"lazy\" decode=\"async\"\/><\/p>\n<div><figcaption>From the pre-print version of the <a href=\"https:\/\/arxiv.org\/html\/2303.00280v3\" rel=\"noopener noreferrer nofollow\">paper<\/a>: architecture of LANET.<\/figcaption><\/div>\n<\/figure>\n<p>The master&#8217;s <a 
href=\"https:\/\/github.com\/intsystems\/Petrushina-MS-Thesis\/blob\/master\/paper\/Thesis2025Petrushina.pdf\" rel=\"noopener noreferrer nofollow\">thesis<\/a> of <strong>Kseniia Petrushina<\/strong>, supervised by <strong>Dr. Alexander Panchenko<\/strong>, addresses the challenge of detecting images that appear realistic but defy common sense, such as a man sleeping on a rock or a snowplow driving through sand. It introduces two methods: one based on detecting logical contradictions between atomic facts describing the image (NLI), and another that uses the hidden representations of LVLMs (Linear Probing). Predictions from the NLI-based method correlate with the presence of hallucinations in the generated facts. The methods are compared against the Through the Looking Glass (TLG) approach, which learns the importance of each fact and achieves the highest accuracy in identifying strange images, with Linear Probing as a close second.<\/p>\n<figure class=\"full-width\"><img decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/2e3\/a31\/4bb\/2e3a314bb972479e45a4f14fbe44aac1.png\" alt=\"From the pre-print version of the paper: a snow plow driving down a snowy street.\" title=\"From the pre-print version of the paper: a snow plow driving down a snowy street.\" width=\"1600\" height=\"822\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/r\/w780\/getpro\/habr\/upload_files\/2e3\/a31\/4bb\/2e3a314bb972479e45a4f14fbe44aac1.png 780w,&#10;       https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/2e3\/a31\/4bb\/2e3a314bb972479e45a4f14fbe44aac1.png 781w\" loading=\"lazy\" decode=\"async\"\/><\/p>\n<div><figcaption>From the pre-print version of the <a href=\"https:\/\/arxiv.org\/pdf\/2503.15948\" rel=\"noopener noreferrer nofollow\">paper<\/a>: a snow plow driving down a snowy street.<\/figcaption><\/div>\n<\/figure>\n<p><strong>German Gritsay\u2019s<\/strong> master&#8217;s <a 
href=\"https:\/\/github.com\/intsystems\/Gritsai-MS-Thesis\/blob\/master\/paper\/MS_thesis.pdf\" rel=\"noopener noreferrer nofollow\">thesis<\/a>, under the supervision of <strong>Dr. Andrey Grabovoy<\/strong>, considers the problem of machine-generated text detection. The thesis improves the interpretability of detection and handles diverse classification problems for detecting and analyzing AI-generated fragments. It proposes attention-based architectures, statistical analysis techniques, and multi-task learning methods to regularize feature representations and improve model generalization. The work evaluates these approaches on synthetic datasets and real-world benchmarks, including international competitions, demonstrating their practical value in multilingual and multidomain scenarios. It also investigates the detection of generated fragments at the document level, presenting token-level classification with segmentation algorithms for variable-length fragments. The other presented approach, multi-task learning (MTL), reduces model complexity, which is supported by theoretical analysis proving lower Rademacher complexity compared to single-task approaches. 
Empirical results confirm the ability of MTL to cluster textual representations in vector space, acting as an implicit regularization and improving robustness across domains and generative models.<\/p>\n<figure class=\"full-width\"><img decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/261\/a8f\/69a\/261a8f69a6ec9810b4fb5a458467f977.png\" alt=\"From the master\u2019s thesis of\u00a0German: PCA decomposition of text embeddings after the transformer encoder.\" title=\"From the master\u2019s thesis of\u00a0German: PCA decomposition of text embeddings after the transformer encoder.\" width=\"1600\" height=\"647\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/r\/w780\/getpro\/habr\/upload_files\/261\/a8f\/69a\/261a8f69a6ec9810b4fb5a458467f977.png 780w,&#10;       https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/261\/a8f\/69a\/261a8f69a6ec9810b4fb5a458467f977.png 781w\" loading=\"lazy\" decode=\"async\"\/><\/p>\n<div><figcaption>From the master\u2019s thesis of\u00a0German: PCA decomposition of text embeddings after the transformer encoder.<\/figcaption><\/div>\n<\/figure>\n<p>The master\u2019s <a href=\"https:\/\/github.com\/intsystems\/Mikhailov-MS-Thesis\/blob\/master\/paper\/main.pdf\" rel=\"noopener noreferrer nofollow\">thesis<\/a> of <strong>Bair Mikhailov<\/strong>, supervised by <strong>Dr. Dmitry Dylov<\/strong>, addresses the phylogenetic relationship between tomistomas, gavials, crocodiles, and alligators, which remains unresolved due to conflicting morphological and molecular evidence. This study introduces a machine learning framework to analyze brain endocasts derived from CT scans, aiming to resolve these evolutionary uncertainties. 
Brain endocasts are segmented from crocodilian cranial scans using a 2D U-Net architecture. Results demonstrate the potential of explainable deep learning to address phylogenetic controversies, offering a scalable, data-driven alternative to subjective morphological comparisons. This work bridges computational radiology and evolutionary biology, providing a template for quantitative neuroanatomical phenotyping in extinct and extant species.<\/p>\n<figure class=\"full-width\"><img decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/b6f\/bf7\/077\/b6fbf70774f68e345aede406aa3cd7bb.png\" alt=\"From Bair's master's thesis: example of prediction of crocodilian brain endocasts.\" title=\"From Bair's master's thesis: example of prediction of crocodilian brain endocasts.\" width=\"1600\" height=\"559\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/r\/w780\/getpro\/habr\/upload_files\/b6f\/bf7\/077\/b6fbf70774f68e345aede406aa3cd7bb.png 780w,&#10;       https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/b6f\/bf7\/077\/b6fbf70774f68e345aede406aa3cd7bb.png 781w\" loading=\"lazy\" decode=\"async\"\/><\/p>\n<div><figcaption>From Bair&#8217;s master&#8217;s thesis: example of prediction of crocodilian brain endocasts.<\/figcaption><\/div>\n<\/figure>\n<p><strong>Ildar Khabutdinov\u2019s<\/strong> master&#8217;s <a href=\"https:\/\/github.com\/intsystems\/Khabutdinov-MS-Thesis\/blob\/master\/paper\/main.pdf\" rel=\"noopener noreferrer nofollow\">thesis<\/a>, supervised by <strong>Dr. Andrey Grabovoy<\/strong>, comprises two studies on grammatical error correction using the Sequence Tagging approach. 
The first study adapts the GECToR model for Russian, addressing the lack of annotated data by creating a synthetic dataset, achieving <img decoding=\"async\" class=\"formula inline\" source=\"F_{0.5} = 82.5\" alt=\"F_{0.5} = 82.5\" src=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/95c\/f1d\/d22\/95cf1dd2281a68ca7122fcf347902bf0.svg\" width=\"93\" height=\"20\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/95c\/f1d\/d22\/95cf1dd2281a68ca7122fcf347902bf0.svg 780w,&#10;       https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/95c\/f1d\/d22\/95cf1dd2281a68ca7122fcf347902bf0.svg 781w\" loading=\"lazy\" decode=\"async\"\/> on synthetic data and demonstrating knowledge transfer to the RULEC test set without fine-tuning (<img decoding=\"async\" class=\"formula inline\" source=\"F_{0.5} = 22.2\" alt=\"F_{0.5} = 22.2\" src=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/00d\/c17\/208\/00dc1720863b3c989b1031049f966b66.svg\" width=\"93\" height=\"20\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/00d\/c17\/208\/00dc1720863b3c989b1031049f966b66.svg 780w,&#10;       https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/00d\/c17\/208\/00dc1720863b3c989b1031049f966b66.svg 781w\" loading=\"lazy\" decode=\"async\"\/>). The second study proposes a fully automated, annotation-free method using the Levenshtein algorithm to generate subword-level edits, which are language-agnostic and require no manual rules or dictionaries. 
Applied to the original GECToR model, it achieves competitive results in English: <img decoding=\"async\" class=\"formula inline\" source=\"F_{0.5} = 62.4\" alt=\"F_{0.5} = 62.4\" src=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/d51\/679\/3e2\/d516793e2a2c31c724201d307738f54c.svg\" width=\"93\" height=\"20\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/d51\/679\/3e2\/d516793e2a2c31c724201d307738f54c.svg 780w,&#10;       https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/d51\/679\/3e2\/d516793e2a2c31c724201d307738f54c.svg 781w\" loading=\"lazy\" decode=\"async\"\/> (CoNLL-2014 dataset) and <img decoding=\"async\" class=\"formula inline\" source=\"F_{0.5} = 61.9\" alt=\"F_{0.5} = 61.9\" src=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/c11\/99d\/8e1\/c1199d8e154110427e76dd83378c6116.svg\" width=\"93\" height=\"20\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/c11\/99d\/8e1\/c1199d8e154110427e76dd83378c6116.svg 780w,&#10;       https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/c11\/99d\/8e1\/c1199d8e154110427e76dd83378c6116.svg 781w\" loading=\"lazy\" decode=\"async\"\/> (BEA-2019 dataset). 
Together, these studies showcase the adaptability of Sequence Tagging models for both resource-rich and low-resource languages.<\/p>\n<figure class=\"full-width\"><img decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/2b3\/bc6\/e0c\/2b3bc6e0c6f60b36e3176e5c04d1c365.png\" alt=\"From Ildar\u2019s master\u2019s thesis: Levenshtein matrix and editing instructions between source and target sequences.\" title=\"From Ildar\u2019s master\u2019s thesis: Levenshtein matrix and editing instructions between source and target sequences.\" width=\"1028\" height=\"510\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/r\/w780\/getpro\/habr\/upload_files\/2b3\/bc6\/e0c\/2b3bc6e0c6f60b36e3176e5c04d1c365.png 780w,&#10;       https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/2b3\/bc6\/e0c\/2b3bc6e0c6f60b36e3176e5c04d1c365.png 781w\" loading=\"lazy\" decode=\"async\"\/><\/p>\n<div><figcaption>From Ildar\u2019s master\u2019s thesis: Levenshtein matrix and editing instructions between source and target sequences.<\/figcaption><\/div>\n<\/figure>\n<p>The bachelor\u2019s <a href=\"https:\/\/github.com\/intsystems\/Sobolevsky-BS-Thesis\/blob\/main\/paper\/Sobolevsky2025BSThesis.pdf\" rel=\"noopener noreferrer nofollow\">thesis<\/a> by <strong>Fedor Sobolevsky<\/strong>, under the supervision of <strong>Prof. Konstantin Vorontsov<\/strong>, examines LLMs applied to the hierarchical summarization task, which involves summarizing a text as a text tree that goes from key points to more specific details. The task is formalized as text tree generation with the goal of minimizing the distance between the generated and the reference summaries. Since this necessitates the use of a metric on the text tree space, a new metric, the text tree edit distance (TTED), is presented. 
To measure the informativeness of the metric in terms of highlighting significant aspects of text tree distance, a new metric quality factor is proposed, as well as an unbiased estimate of the factor on random tree samples. The experimental evaluation of the TTED metric using the proposed quality factor shows a significant improvement in capturing semantic and structural differences of text trees compared to a previously used similarity score, indicating that it can be used for hierarchical summarization scoring.<\/p>\n<figure class=\"full-width\"><img decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/be0\/204\/f7f\/be0204f7fe0189e3a97240dfc25ca814.png\" alt=\"From Fedor\u2019s bachelor\u2019s thesis: distance estimates using TTED and the baseline method.\" title=\"From Fedor\u2019s bachelor\u2019s thesis: distance estimates using TTED and the baseline method.\" width=\"1588\" height=\"520\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/r\/w780\/getpro\/habr\/upload_files\/be0\/204\/f7f\/be0204f7fe0189e3a97240dfc25ca814.png 780w,&#10;       https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/be0\/204\/f7f\/be0204f7fe0189e3a97240dfc25ca814.png 781w\" loading=\"lazy\" decode=\"async\"\/><\/p>\n<div><figcaption>From Fedor\u2019s bachelor\u2019s thesis: distance estimates using TTED and the baseline method.<\/figcaption><\/div>\n<\/figure>\n<p><strong>Arina Chumachenko\u2019s<\/strong> master\u2019s <a href=\"https:\/\/github.com\/arina-chumachenko\/Chumachenko-MS-Thesis\/blob\/main\/paper\/Thesis_Chumachenko_MIPT.pdf\" rel=\"noopener noreferrer nofollow\">thesis<\/a>, supervised by <strong>Prof. Ivan Oseledets<\/strong>, analyzes text-to-image personalization methods. While methods like Textual Inversion and DreamBooth achieve high-fidelity concept generation, maintaining an optimal balance between identity preservation and prompt adherence remains an unresolved challenge. 
The work advances context regularization methods by introducing Gram-based context regularization to improve pose diversity and generation flexibility in synthesized images. A two-stage training strategy incorporating losses from a non-finetuned U-Net model enhances generalization capabilities, while optimized context attention map regularization mitigates overfitting and artifacts. Experimental results demonstrate that these contributions collectively improve concept fidelity and textual alignment, enabling more robust and adaptable personalized image generation within diffusion-based frameworks.<\/p>\n<figure class=\"full-width\"><img decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/198\/066\/185\/1980661856b521fa9a144d0005fadc57.png\" alt=\"From Arina\u2019s Master\u2019s thesis: qualitative comparison of baseline methods (method without any regularizations and CoRe method) and proposed methods for \u00abbear-plushie\u00bb concepts.\" title=\"From Arina\u2019s Master\u2019s thesis: qualitative comparison of baseline methods (method without any regularizations and CoRe method) and proposed methods for \u00abbear-plushie\u00bb concepts.\" width=\"1600\" height=\"712\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/r\/w780\/getpro\/habr\/upload_files\/198\/066\/185\/1980661856b521fa9a144d0005fadc57.png 780w,&#10;       https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/198\/066\/185\/1980661856b521fa9a144d0005fadc57.png 781w\" loading=\"lazy\" decode=\"async\"\/><\/p>\n<div><figcaption>From Arina\u2019s Master\u2019s thesis: qualitative comparison of baseline methods (method without any regularizations and CoRe method) and proposed methods for \u00abbear-plushie\u00bb concepts.<\/figcaption><\/div>\n<\/figure>\n<h2>Optimization<\/h2>\n<p>A number of this year\u2019s theses are dedicated to various aspects of optimization methods, ranging from theoretical 
analysis to practical algorithm design. These works hold strong scientific value and lie at the intersection of optimization theory and foundational machine learning.<\/p>\n<p>The bachelor&#8217;s <a href=\"https:\/\/github.com\/intsystems\/Rebrikov-BS-Thesis\/blob\/master\/paper\/main.pdf\" rel=\"noopener noreferrer nofollow\">thesis<\/a> of<strong> Alexey Rebrikov<\/strong>, supervised by<strong> Dr. Aleksandr Beznosikov<\/strong>, investigates the No Full Gradient SARAH algorithm\u2014a variance-reduction method for stochastic optimization that avoids computing full gradients. Theoretical analysis is provided for both convex and non-convex settings, with convergence guarantees under standard smoothness assumptions. Experiments on image classification tasks show that the algorithm maintains competitive accuracy while reducing computational costs. This work advances efficient optimization techniques for large-scale machine learning.<\/p>\n<figure class=\"full-width\"><img decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/366\/dbd\/1f2\/366dbd1f2f02b5a353a1e86ebb767a6a.png\" alt=\"Experiments on the CIFAR-10 dataset. The proposed algorithm is comparable with SGD and outperforms the original SARAH algorithm.\" title=\"Experiments on the CIFAR-10 dataset. The proposed algorithm is comparable with SGD and outperforms the original SARAH algorithm.\" width=\"903\" height=\"586\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/r\/w780\/getpro\/habr\/upload_files\/366\/dbd\/1f2\/366dbd1f2f02b5a353a1e86ebb767a6a.png 780w,&#10;       https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/366\/dbd\/1f2\/366dbd1f2f02b5a353a1e86ebb767a6a.png 781w\" loading=\"lazy\" decode=\"async\"\/><\/p>\n<div><figcaption>Experiments on the CIFAR-10 dataset. 
The proposed algorithm is comparable with SGD and outperforms the original SARAH algorithm.<\/figcaption><\/div>\n<\/figure>\n<p>The bachelor&#8217;s <a href=\"https:\/\/github.com\/intsystems\/Khafizov-BS-Thesis\/blob\/master\/paper\/BachelorThesis_paper.pdf\" rel=\"noopener noreferrer nofollow\">theses<\/a> of<strong> Fanis Khafizov<\/strong> <a href=\"https:\/\/github.com\/intsystems\/Kasiuk-BS-Thesis\/blob\/main\/docs\/Kasiuk2024CompressionForDistributedOptimization.pdf\" rel=\"noopener noreferrer nofollow\">and<\/a><strong> Vadim Kasiuk<\/strong>, supervised by <strong>Dr. Aleksandr Beznosikov<\/strong>, introduce ImpK, a novel family of importance-based compression operators for distributed learning that selects gradient coordinates based on their impact on the optimization objective rather than just magnitude or randomness. They propose multiple variants of ImpK alongside SCAM, a new error compensation mechanism that improves convergence by incorporating accumulated errors into gradient selection. Theoretical analysis establishes linear convergence rates, and experiments demonstrate superior performance compared to existing methods, especially when combined with SCAM. 
This work advances gradient compression by unifying existing approaches into a generalized framework with strong theoretical and empirical support.<\/p>\n<figure class=\"full-width\"><img decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/6e9\/cb6\/557\/6e9cb6557c9e52ec0c4582195aadcf13.png\" alt=\"Convergence comparison of the proposed method and baseline algorithms on the CIFAR-10 classification task.\" title=\"Convergence comparison of the proposed method and baseline algorithms on the CIFAR-10 classification task.\" width=\"929\" height=\"689\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/r\/w780\/getpro\/habr\/upload_files\/6e9\/cb6\/557\/6e9cb6557c9e52ec0c4582195aadcf13.png 780w,&#10;       https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/6e9\/cb6\/557\/6e9cb6557c9e52ec0c4582195aadcf13.png 781w\" loading=\"lazy\" decode=\"async\"\/><\/p>\n<div><figcaption>Convergence comparison of the proposed method and baseline algorithms on the CIFAR-10 classification task.<\/figcaption><\/div>\n<\/figure>\n<p>The bachelor&#8217;s <a href=\"https:\/\/github.com\/intsystems\/Rubtsov-BS-thesis\/blob\/master\/paper\/RUBTSOV_DIPLOMA.pdf\" rel=\"noopener noreferrer nofollow\">thesis<\/a> of<strong> Denis Rubtsov<\/strong>, supervised by <strong>Prof. Alexander Gasnikov,<\/strong> focuses on the convergence of optimization algorithms. While classical stochastic optimization results typically bound the expected number of iterations needed to reach a target accuracy, Denis\u2019s work investigates algorithms that ensure convergence with high probability\u2014limiting the risk of large deviations from the minimum. The thesis develops efficient methods that provide such guarantees under various assumptions on the objective function. 
Denis\u2019s thesis builds on the concept of robust distance estimation: given a set of points that are likely to lie near the optimum, it is possible to identify a central point that is very close to the true optimum with high probability. This approach enables the design of optimization algorithms with strong probabilistic convergence guarantees.<\/p>\n<figure class=\"\"><img decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/bb2\/b6a\/1cf\/bb2b6a1cf18262a691cbbf31f670474a.png\" alt=\"From the book Problem Complexity and Method Efficiency in Optimization (Arkadi Nemirovski and David Yudin, 1983): an explanation of the main idea of robust distance estimation.\" title=\"From the book Problem Complexity and Method Efficiency in Optimization (Arkadi Nemirovski and David Yudin, 1983): an explanation of the main idea of robust distance estimation.\" width=\"472\" height=\"431\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/r\/w780\/getpro\/habr\/upload_files\/bb2\/b6a\/1cf\/bb2b6a1cf18262a691cbbf31f670474a.png 780w,&#10;       https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/bb2\/b6a\/1cf\/bb2b6a1cf18262a691cbbf31f670474a.png 781w\" loading=\"lazy\" decode=\"async\"\/><\/p>\n<div><figcaption>From the book Problem Complexity and Method Efficiency in Optimization (Arkadi Nemirovski and David Yudin, 1983): an explanation of the main idea of robust distance estimation.<\/figcaption><\/div>\n<\/figure>\n<p>The bachelor&#8217;s <a href=\"https:\/\/github.com\/intsystems\/Zadvornov_Egor_paper\/blob\/master\/paper\/main.pdf\" rel=\"noopener noreferrer nofollow\">thesis<\/a> of<strong> Egor Zadvornov<\/strong>, supervised by<strong> Dr. Artemii Malkov<\/strong>, investigates the problem of forecasting the distribution of topics in a media stream. 
It is shown that classical regression forecasting approaches are not robust to irregular cyclic changes with a variable period. The ARFilter method is proposed, which improves the accuracy of regression models by selecting a subset of dictionary words that correlate with the time distribution of the corresponding topics. For the selected words, local forecasts of their normalized shares are built, which are then aggregated into a forecast of the distribution of topics. On synthetic data, the method reduces the forecast error tenfold compared to the basic ARIMA approach. The developed approach does not depend on the choice of a specific autoregressive model and is effective for forecasting highly volatile topic distributions.<\/p>\n<h2>Machine learning fundamentals and computational mathematics<\/h2>\n<p>A large number of this year\u2019s theses focused on fundamental machine learning and applied mathematics. These works explore core principles and address key challenges in modern machine learning and data science. Each thesis demonstrates strong research potential, offering valuable insights into representation learning, generalization, model interpretability, and the mathematical foundations of learning algorithms.<\/p>\n<p><strong>Eduard Vladimirov\u2019s<\/strong> master\u2019s <a href=\"https:\/\/github.com\/intsystems\/Vladimirov-MS-Thesis\/blob\/master\/paper\/Vladimirov2024GenerativeCIPaper.pdf\" rel=\"noopener noreferrer nofollow\">thesis<\/a>, supervised by <strong>Prof. Vadim Strijov<\/strong>, introduces CaSCA, a linear autoencoder that separates latent states into causal and reconstructive components by jointly optimizing reconstruction and predictive skill. Applied to sensor data, CaSCA reduces multicollinearity, preserves explained variance, and improves forecasting and classification accuracy. 
The method formalizes causal dimensionality reduction and proves the identifiability of causal drivers up to rotation.<\/p>\n<figure class=\"full-width\"><img decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/5af\/480\/f43\/5af480f437d560d65a7a9cb65a5a5c89.png\" alt=\"The scheme presents the main idea of CaSCA.\" title=\"The scheme presents the main idea of CaSCA.\" width=\"959\" height=\"630\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/r\/w780\/getpro\/habr\/upload_files\/5af\/480\/f43\/5af480f437d560d65a7a9cb65a5a5c89.png 780w,&#10;       https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/5af\/480\/f43\/5af480f437d560d65a7a9cb65a5a5c89.png 781w\" loading=\"lazy\" decode=\"async\"\/><\/p>\n<div><figcaption>The scheme presents the main idea of CaSCA.<\/figcaption><\/div>\n<\/figure>\n<p>In connection with the mentioned work, the bachelor\u2019s <a href=\"https:\/\/github.com\/intsystems\/Eynullayev-BS-Thesis\/blob\/master\/paper\/paper.pdf\" rel=\"noopener noreferrer nofollow\">thesis<\/a> of <strong>Altay Eynullayev<\/strong>, supervised by <strong>Prof. Vadim Strijov,<\/strong> explores the use of covariance matrices of multidimensional time series to enhance forecasting accuracy. The approach leverages the Riemannian geometry of the space of symmetric positive definite (SPD) matrices to build models that incorporate covariance structure. 
Additionally, the thesis analyzes these matrices to identify conditions under which the method is most effective.<\/p>\n<figure class=\"\"><img decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/6e2\/aa1\/865\/6e2aa186534392b39e0a1350bc7d3d5b.png\" alt=\"Translation between objects in covariance matrix space (Ct) and corresponding tangent space (Tc).\" title=\"Translation between objects in covariance matrix space (Ct) and corresponding tangent space (Tc).\" width=\"501\" height=\"351\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/r\/w780\/getpro\/habr\/upload_files\/6e2\/aa1\/865\/6e2aa186534392b39e0a1350bc7d3d5b.png 780w,&#10;       https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/6e2\/aa1\/865\/6e2aa186534392b39e0a1350bc7d3d5b.png 781w\" loading=\"lazy\" decode=\"async\"\/><\/p>\n<div><figcaption>Translation between objects in covariance matrix space (Ct) and corresponding tangent space (Tc).<\/figcaption><\/div>\n<\/figure>\n<p><strong>Nikita Kornilov\u2019s<\/strong> master\u2019s <a href=\"https:\/\/github.com\/intsystems\/Kornilov_MS_Thesis\/blob\/master\/paper\/Kornilov_Nikita_Thesis_IAD_v3.pdf\" rel=\"noopener noreferrer nofollow\">thesis<\/a>, supervised by <strong>Prof. Alexander Gasnikov<\/strong>, focuses on Flow Matching. Recent advances in Flow Matching for generative modeling increasingly aim to learn straight trajectories for fast inference. However, existing methods often rely on costly iterative optimization or discrete Optimal Transport heuristics. To address this, Nikita proposed Optimal Flow Matching \u2014 a novel approach that consistently recovers optimal straight trajectories with a single loss minimization. The method combines probability transformation via ODEs from Flow Matching with straight transports from Optimal Transport. 
The thesis led to a <a href=\"https:\/\/openreview.net\/forum?id=kqmucDKVcU\" rel=\"noopener noreferrer nofollow\">paper<\/a> presented at NeurIPS.<\/p>\n<figure class=\"\"><img decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/e98\/127\/3a5\/e981273a56fbb42fb1cb5ded7203e7d5.png\" alt=\"From paper: the proposed Optimal Flow Matching obtains exactly straight transport trajectories.\" title=\"From paper: the proposed Optimal Flow Matching obtains exactly straight transport trajectories.\" width=\"363\" height=\"352\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/r\/w780\/getpro\/habr\/upload_files\/e98\/127\/3a5\/e981273a56fbb42fb1cb5ded7203e7d5.png 780w,&#10;       https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/e98\/127\/3a5\/e981273a56fbb42fb1cb5ded7203e7d5.png 781w\" loading=\"lazy\" decode=\"async\"\/><\/p>\n<div><figcaption>From <a href=\"https:\/\/openreview.net\/forum?id=kqmucDKVcU\" rel=\"noopener noreferrer nofollow\">paper<\/a>: the proposed Optimal Flow Matching obtains exactly straight transport trajectories.<\/figcaption><\/div>\n<\/figure>\n<p><strong>Marat Khusainov\u2019s<\/strong> master&#8217;s <a href=\"https:\/\/github.com\/intsystems\/Khusainov-MS-Thesis\/blob\/master\/paper\/main.pdf\" rel=\"noopener noreferrer nofollow\">thesis<\/a>, supervised by <strong>Dr. Sergey Samsonov<\/strong>, investigates Generative Flow Networks (GFlowNets)\u2014models designed to sample compositional discrete objects like graphs or strings from distributions defined by unnormalized probability mass functions. The study identifies the dormant neuron phenomenon in GFlowNets, where an increasing number of inactive neurons during training reduces network expressivity. To address this, Marat proposes a simple method: periodically checking for dormant neurons across all layers during training and reinitializing their incoming and outgoing weights. 
This approach was validated on biochemical tasks, showing improved GFlowNet performance.<\/p>\n<p>The bachelor\u2019s <a href=\"https:\/\/github.com\/intsystems\/2024-Project-152\/blob\/master\/paper\/thesis%20latest.pdf\" rel=\"noopener noreferrer nofollow\">thesis<\/a> of <strong>Gleb Karpeev<\/strong>, supervised by <strong>Prof. Vadim Strijov<\/strong>, tackles the problem of forecasting sets of correlated time series. The proposed framework encodes time series into covariance matrices and performs forecasting directly on the Riemannian manifold of symmetric positive definite (SPD) matrices. To model dynamics on this manifold, a Riemannian Score-Based Generative Model (RSGM) is adapted for conditional forecasting. Experiments on synthetic data demonstrate the method\u2019s effectiveness, emphasizing the importance of respecting the data\u2019s underlying geometry.<\/p>\n<figure class=\"full-width\"><img decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/7c7\/ba9\/df7\/7c7ba9df741d069ca47d5e6adc804f79.png\" alt=\"A schematic illustration of a score-based generative model, which is the basis of Gleb\u2019s work. The forward process gradually adds noise to the data, transforming it into a Gaussian distribution. The reverse process starts from pure noise and reconstructs the original object.\" title=\"A schematic illustration of a score-based generative model, which is the basis of Gleb\u2019s work. The forward process gradually adds noise to the data, transforming it into a Gaussian distribution. 
The reverse process starts from pure noise and reconstructs the original object.\" width=\"727\" height=\"399\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/r\/w780\/getpro\/habr\/upload_files\/7c7\/ba9\/df7\/7c7ba9df741d069ca47d5e6adc804f79.png 780w,&#10;       https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/7c7\/ba9\/df7\/7c7ba9df741d069ca47d5e6adc804f79.png 781w\" loading=\"lazy\" decode=\"async\"\/><\/p>\n<div><figcaption>A schematic illustration of a score-based generative model, which is the basis of Gleb\u2019s work. The forward process gradually adds noise to the data, transforming it into a Gaussian distribution. The reverse process starts from pure noise and reconstructs the original object.<\/figcaption><\/div>\n<\/figure>\n<p>In his bachelor\u2019s <a href=\"https:\/\/github.com\/intsystems\/Firsov_KOT_DARTS\/blob\/master\/paper\/main_paper.pdf\" rel=\"noopener noreferrer nofollow\">thesis<\/a>, <strong>Sergey Firsov<\/strong>, supervised by <strong>Dr. Oleg Bakhteev<\/strong>, proposes a hardware-aware Neural Architecture Search (NAS) method based on DARTS, a well-known differentiable NAS method. The approach introduces a complexity vector with per-operation penalties and a hypernetwork that maps it to architecture logits. 
This enables a single training run to yield a family of models adapted to various hardware constraints.<\/p>\n<figure class=\"full-width\"><img decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/099\/ca3\/665\/099ca3665808ebcfbedd2b4ac7c69f63.png\" alt=\"The illustration of the main idea of the proposed method, here S is a complexity vector sampled from a simplex space.\" title=\"The illustration of the main idea of the proposed method, here S is a complexity vector sampled from a simplex space.\" width=\"727\" height=\"297\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/r\/w780\/getpro\/habr\/upload_files\/099\/ca3\/665\/099ca3665808ebcfbedd2b4ac7c69f63.png 780w,&#10;       https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/099\/ca3\/665\/099ca3665808ebcfbedd2b4ac7c69f63.png 781w\" loading=\"lazy\" decode=\"async\"\/><\/p>\n<div><figcaption>The illustration of the main idea of the proposed method, here S is a complexity vector sampled from a simplex space.<\/figcaption><\/div>\n<\/figure>\n<p><strong>Muhammadsharif Nabiev\u2019s<\/strong> bachelor&#8217;s <a href=\"https:\/\/github.com\/intsystems\/Nabiev-BS-Thesis\/blob\/master\/paper\/Nabiev2025IB.pdf\" rel=\"noopener noreferrer nofollow\">thesis<\/a>, supervised by <strong>Dr. Oleg Bakhteev<\/strong>, also considers a model selection problem. He explores inductive bias in multitask learning through learned representations. Using a shared encoder\u2013decoder setup, the work applies the Information Bottleneck framework to study the trade-off between compression and relevance. 
Experiments reveal that optimal models balance high accuracy, strong compression, and low mutual information, forming a Pareto front across these objectives.<\/p>\n<figure class=\"full-width\"><img decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/9fc\/76f\/802\/9fc76f802cc0284c130d77aa39269aab.png\" alt=\"A visualization of the discovered Pareto front: all identified models are shown in blue, the Pareto front is highlighted in orange, and the red point marks the true model.\" title=\"A visualization of the discovered Pareto front: all identified models are shown in blue, the Pareto front is highlighted in orange, and the red point marks the true model.\" width=\"572\" height=\"432\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/r\/w780\/getpro\/habr\/upload_files\/9fc\/76f\/802\/9fc76f802cc0284c130d77aa39269aab.png 780w,&#10;       https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/9fc\/76f\/802\/9fc76f802cc0284c130d77aa39269aab.png 781w\" loading=\"lazy\" decode=\"async\"\/><\/p>\n<div><figcaption>A visualization of the discovered Pareto front: all identified models are shown in blue, the Pareto front is highlighted in orange, and the red point marks the true model.<\/figcaption><\/div>\n<\/figure>\n<p><strong>Ivan Papay\u2019s<\/strong> bachelor&#8217;s <a href=\"https:\/\/github.com\/intsystems\/Papay-BS-Thesis\/blob\/master\/paper\/PartialOrders.pdf\" rel=\"noopener noreferrer nofollow\">thesis<\/a>, supervised by <strong>Prof. Vadim Strijov<\/strong>, addresses ordinal classification with objects described by partially ordered features. He proposes aggregating these partial orders using weighted incidence matrices of order graphs. The solution is found as a projection of response variables onto a superposition of partial order cones. 
Ivan also introduces an optimal parameter estimation method that improves noise resistance.<\/p>\n<p>The work of <strong>Anastasia Linich<\/strong> lies at the intersection of fundamental mathematics and applied machine learning. In her bachelor&#8217;s <a href=\"https:\/\/github.com\/khilling\/diploma\/blob\/main\/%D0%92%D0%9A%D0%A0.pdf\" rel=\"noopener noreferrer nofollow\">thesis<\/a>, supervised by <strong>Prof. Serguei Barannikov<\/strong>, she developed interpretable classifiers for evaluating partial Lean 4 proofs using attention maps from DeepSeek-Prover. Lean 4 is a formal proof language used for writing and verifying mathematical theorems with machine assistance. Anastasia proposed two methods, manifold topology divergence and block-end self-attention, which outperformed baseline approaches. The best model achieved <img decoding=\"async\" class=\"formula inline\" source=\"62\\%\" alt=\"62\\%\" src=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/0f6\/994\/408\/0f6994408a8a22b8bf57843e86bae43c.svg\" width=\"36\" height=\"19\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/0f6\/994\/408\/0f6994408a8a22b8bf57843e86bae43c.svg 780w,&#10;       https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/0f6\/994\/408\/0f6994408a8a22b8bf57843e86bae43c.svg 781w\" loading=\"lazy\" decode=\"async\"\/> accuracy, surpassing zero-shot prompting by <img decoding=\"async\" class=\"formula inline\" source=\"3\\%\" alt=\"3\\%\" src=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/7a4\/162\/e82\/7a4162e82a3a92f913cfa25cbe036c63.svg\" width=\"26\" height=\"19\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/7a4\/162\/e82\/7a4162e82a3a92f913cfa25cbe036c63.svg 780w,&#10;       https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/7a4\/162\/e82\/7a4162e82a3a92f913cfa25cbe036c63.svg 781w\" loading=\"lazy\" 
decode=\"async\"\/>.<\/p>\n<figure class=\"full-width\"><img decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/8ff\/9ea\/581\/8ff9ea581803c6f13fe60afa9272d410.png\" alt=\"Prompts used for evaluating Lean 4 proofs.\" title=\"Prompts used for evaluating Lean 4 proofs.\" width=\"699\" height=\"564\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/r\/w780\/getpro\/habr\/upload_files\/8ff\/9ea\/581\/8ff9ea581803c6f13fe60afa9272d410.png 780w,&#10;       https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/8ff\/9ea\/581\/8ff9ea581803c6f13fe60afa9272d410.png 781w\" loading=\"lazy\" decode=\"async\"\/><\/p>\n<div><figcaption>Prompts used for evaluating Lean 4 proofs.<\/figcaption><\/div>\n<\/figure>\n<p><strong>Vladislav Meshkov\u2019s<\/strong> bachelor&#8217;s <a href=\"https:\/\/github.com\/intsystems\/Hessian-Based-Analysis-of-Matricized-Networks\/blob\/master\/diploma\/main_diploma.pdf\" rel=\"noopener noreferrer nofollow\">thesis<\/a>, under the supervision of <strong>Dr. Andrey Grabovoy<\/strong> and <strong>Nikita Kiselev<\/strong>, analyzes the Hessian of a neural network as a tool for understanding the loss landscape and the characteristics of the network architecture. The Hessian matrix captures important information about the curvature, sensitivity, and local behavior of the loss function. The work proposes a method that improves the understanding of this local behavior and can be used both to analyze the behavior of neural networks and to interpret their parameters. Specifically, Vladislav proposes a method for estimating the Hessian matrix norm for a specific type of convolutional network. He obtained results for both 1D and 2D convolutions, as well as for the fully connected head in these networks. 
The empirical analysis supports these findings, demonstrating convergence in the loss function landscape. He also evaluated the Hessian norm for neural networks represented as a product of matrices and considered how this estimate affects the landscape of the loss function.<\/p>\n<figure class=\"full-width\"><img decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/ed8\/67f\/d70\/ed867fd70fd2d95fef3b6b3c3accd7c1.png\" alt=\"From paper: Part (a) shows the loss function landscape, which is a surface in the parameter space. Part (b) shows the loss difference. It arises when one more object is added to the dataset. Here we exhibit the behavior for a dimension equal to 2. Near the minimum, the mean loss value for k+1 objects tends to be similar to the same for k objects.\" title=\"From paper: Part (a) shows the loss function landscape, which is a surface in the parameter space. Part (b) shows the loss difference. It arises when one more object is added to the dataset. Here we exhibit the behavior for a dimension equal to 2. Near the minimum, the mean loss value for k+1 objects tends to be similar to the same for k objects.\" width=\"685\" height=\"300\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/r\/w780\/getpro\/habr\/upload_files\/ed8\/67f\/d70\/ed867fd70fd2d95fef3b6b3c3accd7c1.png 780w,&#10;       https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/ed8\/67f\/d70\/ed867fd70fd2d95fef3b6b3c3accd7c1.png 781w\" loading=\"lazy\" decode=\"async\"\/><\/p>\n<div><figcaption>From <a href=\"https:\/\/link.springer.com\/article\/10.1134\/S1064562424601987\" rel=\"noopener noreferrer nofollow\">paper<\/a>: Part (a) shows the loss function landscape, which is a surface in the parameter space. Part (b) shows the loss difference. It arises when one more object is added to the dataset. 
Here we exhibit the behavior for a dimension equal to <img decoding=\"async\" class=\"formula inline\" source=\"2\" alt=\"2\" src=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/85c\/8d7\/83d\/85c8d783ded447c771707e7caf77f877.svg\" width=\"8\" height=\"14\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/85c\/8d7\/83d\/85c8d783ded447c771707e7caf77f877.svg 780w,&#10;       https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/85c\/8d7\/83d\/85c8d783ded447c771707e7caf77f877.svg 781w\" loading=\"lazy\" decode=\"async\"\/>. Near the minimum, the mean loss value for <img decoding=\"async\" class=\"formula inline\" source=\"k+1\" alt=\"k+1\" src=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/464\/e00\/7c1\/464e007c10558990da28ac05e1897f04.svg\" width=\"36\" height=\"15\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/464\/e00\/7c1\/464e007c10558990da28ac05e1897f04.svg 780w,&#10;       https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/464\/e00\/7c1\/464e007c10558990da28ac05e1897f04.svg 781w\" loading=\"lazy\" decode=\"async\"\/> objects tends to be similar to the same for <img decoding=\"async\" class=\"formula inline\" source=\"k\" alt=\"k\" src=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/d48\/348\/6b0\/d483486b0295dfcaf8acf6a60af05242.svg\" width=\"8\" height=\"14\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/d48\/348\/6b0\/d483486b0295dfcaf8acf6a60af05242.svg 780w,&#10;       https:\/\/habrastorage.org\/getpro\/habr\/upload_files\/d48\/348\/6b0\/d483486b0295dfcaf8acf6a60af05242.svg 781w\" loading=\"lazy\" decode=\"async\"\/> objects.<\/figcaption><\/div>\n<\/figure>\n<p><strong>Ilya Stepanov<\/strong>\u2019s bachelor&#8217;s <a href=\"https:\/\/github.com\/intsystems\/Stepanov-BS-Thesis\/tree\/master\/paper\/main.pdf\" rel=\"noopener 
noreferrer nofollow\">thesis<\/a>, under the supervision of <strong>Dr. Andrey Grabovoy<\/strong> and <strong>Andrey Filatov<\/strong>, investigates the problem of data augmentation. Data augmentation is a crucial tool for modern object detection researchers, enabling the expansion of training datasets. However, existing methods are limited as they fail to provide substantial semantic enrichment of data, which may reduce models&#8217; generalization capabilities. The study proposes a novel augmentation technique based on semantic object replacement in images. The proposed approach enriches training datasets and improves the accuracy of detection models. Experiments demonstrate the method&#8217;s impact on the mAP50 and mAP50-95 quality metrics and analyze the contribution of individual components to these metrics.<\/p>\n<figure class=\"full-width\"><img decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/f0d\/43a\/29e\/f0d43a29e20f711a0ef06e01a6a823d7.png\" alt=\"From Ilya\u2019s bachelor's thesis: original images.\" title=\"From Ilya\u2019s bachelor's thesis: original images.\" width=\"842\" height=\"310\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/r\/w780\/getpro\/habr\/upload_files\/f0d\/43a\/29e\/f0d43a29e20f711a0ef06e01a6a823d7.png 780w,&#10;       https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/f0d\/43a\/29e\/f0d43a29e20f711a0ef06e01a6a823d7.png 781w\" loading=\"lazy\" decode=\"async\"\/><\/p>\n<div><figcaption>From Ilya\u2019s bachelor&#8217;s thesis: original images.<\/figcaption><\/div>\n<\/figure>\n<figure class=\"full-width\"><img decoding=\"async\" src=\"https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/4aa\/1d4\/599\/4aa1d4599670b3107a76d4854fb019fd.png\" alt=\"From Ilya\u2019s bachelor's thesis: generated images in which the dog is changed to a bird and a cow to a sheep.\" title=\"From Ilya\u2019s bachelor's 
thesis: generated images in which the dog is changed to a bird and a cow to a sheep.\" width=\"844\" height=\"310\" sizes=\"auto, (max-width: 780px) 100vw, 50vw\" srcset=\"https:\/\/habrastorage.org\/r\/w780\/getpro\/habr\/upload_files\/4aa\/1d4\/599\/4aa1d4599670b3107a76d4854fb019fd.png 780w,&#10;       https:\/\/habrastorage.org\/r\/w1560\/getpro\/habr\/upload_files\/4aa\/1d4\/599\/4aa1d4599670b3107a76d4854fb019fd.png 781w\" loading=\"lazy\" decode=\"async\"\/><\/p>\n<div><figcaption>From Ilya\u2019s bachelor&#8217;s thesis: generated images in which the dog is changed to a bird and a cow to a sheep.<\/figcaption><\/div>\n<\/figure>\n<h2>Conclusion<\/h2>\n<p>The Intelligent Systems Department has developed a sustainable process for student research, defence, and graduation. It starts immediately after a student joins our Department and meets their scientific advisor. Our advisors are required to hold a Ph.D. or D.Sc. in physics and mathematics. They create a long-term research plan for each student that meets the student&#8217;s career goals, accounts for risks, and forecasts responses from the scientific community. During their studies, all research projects pass through checkpoints with presentations and discussions. We highlight two of these checkpoints. The first is the 2025 student conference, which brings together the Department\u2019s students and students from other <a href=\"https:\/\/www.youtube.com\/watch?v=5br4xqTASf4\" rel=\"noopener noreferrer nofollow\">universities<\/a>. The second is the 2025 pre-defence sessions, where students present their work and receive feedback from the faculty; recordings are available for <a href=\"https:\/\/www.youtube.com\/watch?v=TADwppxHtIU\" rel=\"noopener noreferrer nofollow\">MS<\/a> and <a href=\"https:\/\/www.youtube.com\/watch?v=M6TfX_ZHhTA\" rel=\"noopener noreferrer nofollow\">BS<\/a> works.<\/p>\n<p>The Intelligent Systems Department ensures that the defended theses are of scientific publication quality. 
Many of the thesis works are either already published or currently under review at leading machine learning conferences and journals.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<p><!----><!----><\/div>\n<p><!----><!----><br \/> \u0441\u0441\u044b\u043b\u043a\u0430 \u043d\u0430 \u043e\u0440\u0438\u0433\u0438\u043d\u0430\u043b \u0441\u0442\u0430\u0442\u044c\u0438 <a href=\"https:\/\/habr.com\/ru\/articles\/931468\/\"> https:\/\/habr.com\/ru\/articles\/931468\/<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<div><!--[--><!--]--><\/div>\n<div id=\"post-content-body\">\n<div>\n<div class=\"article-formatted-body article-formatted-body article-formatted-body_version-2\">\n<div xmlns=\"http:\/\/www.w3.org\/1999\/xhtml\">\n<figure class=\"full-width\"><\/figure>\n<p>The students of the Intelligent Systems Department successfully defended their bachelor\u2019s and master\u2019s theses. This year, 14 Bachelor\u2019s and 8 Master\u2019s students earned their degrees in Physics, Mathematics, and Computer Sciences. We are proud to say that our Department is unique in publishing the complete set of defense materials during the last ten years. These materials include the text of the <a href=\"https:\/\/intsystems.github.io\/materials\/thesis\/\" rel=\"noopener noreferrer nofollow\">dissertation work<\/a>, the published papers, the code of the computational experiments, and the slides with video of the defense talk.<\/p>\n<p>We encourage our students to publish the results of their scientific research in peer-reviewed journals. <a href=\"https:\/\/habr.com\/ru\/articles\/871802\/\" rel=\"noopener noreferrer nofollow\">In 2024, our students published 53 papers<\/a>. A great example of the thesis work is a formatted scientific paper for a BS student and several papers for an MS student. 
It ensures that the results of the student&#8217;s research are critically reviewed and approved by the scientific community.<\/p>\n<p>In 2025, AI assistance became a challenge for thesis defences. The problem is that AI assistants intrude on the text of student work. We omit many unnecessary formalities of text preparation, such as the desired number of pages and the list of performance criteria. Instead, we review the focus of the work. The student&#8217;s personal results should contribute to the theory of machine learning. In this light, AI assistance expands the students&#8217; experience and does not invade their texts.<\/p>\n<p>In this post, we gladly summarize the defended works of our BS and MS students and highlight the results. Recordings of their pre-defence presentations (in Russian) can be found <a href=\"https:\/\/www.youtube.com\/watch?v=TADwppxHtIU\" rel=\"noopener noreferrer nofollow\">here<\/a> and <a href=\"https:\/\/www.youtube.com\/watch?v=M6TfX_ZHhTA\" rel=\"noopener noreferrer nofollow\">here<\/a>. Most of the theses have a publicly available English version.<\/p>\n<p>We motivate our students to contribute the main part of their efforts to the development of the theory of Machine Learning. However, Optimization and Applied Data Science Research are also on the agenda.<\/p>\n<h2>Applied methods in machine learning<\/h2>\n<p>Research in applied machine learning methods is one of the popular topics among our students, as it has the most immediate impact on our lives.<\/p>\n<p><strong>Galina Boeva<\/strong>\u2019s master&#8217;s <a href=\"https:\/\/github.com\/intsystems\/Boeva-MS-Thesis\/blob\/master\/paper\/thesis_master_2025.pdf\" rel=\"noopener noreferrer nofollow\">thesis<\/a>, supervised by our alumnus <strong>Dr. Alexey Zaytsev<\/strong>, introduces LANET, a model for predicting timestamp sets using attention and historical data aggregation. 
Its key innovation is modeling label relationships, supported by theoretical analysis and attention graph visualizations. The work resulted in a publication at the ECAI <a href=\"https:\/\/ebooks.iospress.nl\/volumearticle\/70171\" rel=\"noopener noreferrer nofollow\">conference<\/a>.<\/p>\n<figure class=\"full-width\">\n<div><figcaption>From the pre-print version of the <a href=\"https:\/\/arxiv.org\/html\/2303.00280v3\" rel=\"noopener noreferrer nofollow\">paper<\/a>: architecture of LANET.<\/figcaption><\/div>\n<\/figure>\n<p>The master&#8217;s <a href=\"https:\/\/github.com\/intsystems\/Petrushina-MS-Thesis\/blob\/master\/paper\/Thesis2025Petrushina.pdf\" rel=\"noopener noreferrer nofollow\">thesis<\/a> of <strong>Kseniia Petrushina<\/strong>, supervised by <strong>Dr. Alexander Panchenko<\/strong>, addresses the challenge of detecting images that appear realistic but defy common sense, such as a man sleeping on a rock or a snowplow driving through sand. It introduces two methods: one based on detecting logical contradictions between atomic facts describing the image via natural language inference (NLI), and another that uses the hidden representations of large vision-language models (LVLMs), referred to as Linear Probing. Predictions from the NLI-based method correlate with the presence of hallucinations in the generated facts. 
The methods are compared against the Through the Looking Glass (TLG) approach, which learns the importance of each fact and achieves the highest accuracy in identifying strange images, with Linear Probing as a close second.<\/p>\n<figure class=\"full-width\">\n<div><figcaption>From the pre-print version of the <a href=\"https:\/\/arxiv.org\/pdf\/2503.15948\" rel=\"noopener noreferrer nofollow\">paper<\/a>: a snow plow driving down a snowy street.<\/figcaption><\/div>\n<\/figure>\n<p><strong>German Gritsay\u2019s<\/strong> master&#8217;s <a href=\"https:\/\/github.com\/intsystems\/Gritsai-MS-Thesis\/blob\/master\/paper\/MS_thesis.pdf\" rel=\"noopener noreferrer nofollow\">thesis<\/a>, under the supervision of <strong>Dr. Andrey Grabovoy<\/strong>, considers the problem of machine-generated text detection. The thesis improves the interpretability of detection and handles diverse classification problems for detecting and analyzing AI-generated fragments. It proposes attention-based architectures, statistical analysis techniques, and multi-task learning methods to regularize feature representations and improve model generalization. These approaches are evaluated on synthetic datasets and real-world benchmarks, including international competitions, demonstrating their practical value in multilingual and multidomain scenarios. The work also investigates approaches for detecting generated fragments at the document level, presenting token-level classification with segmentation algorithms for variable-length fragments. The other presented approach, multi-task learning (MTL), reduces model complexity, which is supported by theoretical analysis proving lower Rademacher complexity compared to single-task approaches. 
Empirical results confirm the ability of MTL to cluster textual representations in vector space, acting as an implicit regularization and improving robustness across domains and generative models.<\/p>\n<figure class=\"full-width\">\n<div><figcaption>From German\u2019s master\u2019s thesis: PCA decomposition of text embeddings after the transformer encoder.<\/figcaption><\/div>\n<\/figure>\n<p>The master\u2019s <a href=\"https:\/\/github.com\/intsystems\/Mikhailov-MS-Thesis\/blob\/master\/paper\/main.pdf\" rel=\"noopener noreferrer nofollow\">thesis<\/a> of <strong>Bair Mikhailov<\/strong>, supervised by <strong>Dr. Dmitry Dylov<\/strong>, analyzes the phylogenetic relationship between tomistomas, gavials, crocodiles, and alligators, which remains unresolved due to conflicting morphological and molecular evidence. The study introduces a machine learning framework to analyze brain endocasts derived from CT scans, aiming to resolve these evolutionary uncertainties. The brain endocasts are segmented from crocodilian cranial scans using a 2D U-Net architecture. Results demonstrate the potential of explainable deep learning to address phylogenetic controversies, offering a scalable, data-driven alternative to subjective morphological comparisons. This work bridges computational radiology and evolutionary biology, providing a template for quantitative neuroanatomical phenotyping in extinct and extant species.<\/p>\n<figure class=\"full-width\">\n<div><figcaption>From Bair&#8217;s master&#8217;s thesis: example of prediction of crocodilian brain endocasts.<\/figcaption><\/div>\n<\/figure>\n<p><strong>Ildar Khabutdinov\u2019s<\/strong> master&#8217;s <a href=\"https:\/\/github.com\/intsystems\/Khabutdinov-MS-Thesis\/blob\/master\/paper\/main.pdf\" rel=\"noopener noreferrer nofollow\">thesis<\/a>, supervised by <strong>Dr. Andrey Grabovoy<\/strong>, presents two studies on grammatical error correction using the Sequence Tagging approach. 
The first study adapts the GECToR model for Russian, addressing the lack of annotated data by creating a synthetic dataset, achieving  on synthetic data and demonstrating knowledge transfer to the RULEC test set without fine-tuning. The second study proposes a fully automated, annotation-free method using the Levenshtein algorithm to generate subword-level edits, which are language-agnostic and require no manual rules or dictionaries. Applied to the original GECToR model, it achieves competitive results in English:  (CoNLL-2014 dataset) and  (BEA-2019 dataset). Together, these studies showcase the adaptability of Sequence Tagging models for both resource-rich and low-resource languages.<\/p>\n<figure class=\"full-width\">\n<div><figcaption>From Ildar\u2019s master\u2019s thesis: Levenshtein matrix and editing instructions between source and target sequences.<\/figcaption><\/div>\n<\/figure>\n<p>The bachelor\u2019s <a href=\"https:\/\/github.com\/intsystems\/Sobolevsky-BS-Thesis\/blob\/main\/paper\/Sobolevsky2025BSThesis.pdf\" rel=\"noopener noreferrer nofollow\">thesis<\/a> by <strong>Fedor Sobolevsky<\/strong>, under the supervision of <strong>Prof. Konstantin Vorontsov<\/strong>, examines LLMs applied to the hierarchical summarization task, that is, summarizing a text as a text tree that goes from key points to more specific details. The task is formalized as text tree generation with the goal of minimizing the distance between the generated and the reference summaries. Since this necessitates the use of a metric on the text tree space, a new metric, the text tree edit distance (TTED), is presented. To measure the informativeness of the metric in terms of highlighting significant aspects of text tree distance, a new metric quality factor is proposed, as well as an unbiased estimate of the factor on random tree samples. 
The experimental evaluation of the TTED metric using the proposed quality factor shows a significant improvement in capturing semantic and structural differences of text trees compared to a previously used similarity score, which indicates that TTED can be used for hierarchical summarization scoring.<\/p>\n<figure class=\"full-width\">\n<div><figcaption>From Fedor\u2019s bachelor\u2019s thesis: distance estimations using TTED and the baseline method.<\/figcaption><\/div>\n<\/figure>\n<p><strong>Arina Chumachenko\u2019s<\/strong> master\u2019s <a href=\"https:\/\/github.com\/arina-chumachenko\/Chumachenko-MS-Thesis\/blob\/main\/paper\/Thesis_Chumachenko_MIPT.pdf\" rel=\"noopener noreferrer nofollow\">thesis<\/a>, supervised by <strong>Prof. Ivan Oseledets<\/strong>, analyzes text-to-image personalization methods. While methods like Textual Inversion and DreamBooth achieve high-fidelity concept generation, maintaining an optimal balance between identity preservation and prompt adherence remains an unresolved challenge. The work advances context regularization methods by introducing Gram-based context regularization to improve pose diversity and generation flexibility in synthesized images. A two-stage training strategy incorporating losses from a non-finetuned U-Net model enhances generalization capabilities, while optimized context attention map regularization mitigates overfitting and artifacts. 
Experimental results demonstrate that these contributions collectively improve concept fidelity and textual alignment, enabling more robust and adaptable personalized image generation within diffusion-based frameworks.<\/p>\n<figure class=\"full-width\">\n<div><figcaption>From Arina\u2019s Master\u2019s thesis: qualitative comparison of baseline methods (method without any regularizations and CoRe method) and proposed methods for \u00abbear-plushie\u00bb concepts.<\/figcaption><\/div>\n<\/figure>\n<h2>Optimization<\/h2>\n<p>A number of this year\u2019s theses are dedicated to various aspects of optimization methods \u2014 ranging from theoretical analysis to practical algorithm design. These works hold strong scientific value and lie at the intersection of optimization theory and foundational machine learning.<\/p>\n<p>The bachelor&#8217;s <a href=\"https:\/\/github.com\/intsystems\/Rebrikov-BS-Thesis\/blob\/master\/paper\/main.pdf\" rel=\"noopener noreferrer nofollow\">thesis<\/a> of<strong> Alexey Rebrikov<\/strong>, supervised by<strong> Dr. Aleksandr Beznosikov<\/strong>, investigates the No Full Gradient SARAH algorithm\u2014a variance-reduction method for stochastic optimization that avoids computing full gradients. Theoretical analysis is provided for both convex and non-convex settings, with convergence guarantees under standard smoothness assumptions. Experiments on image classification tasks show that the algorithm maintains competitive accuracy while reducing computational costs. This work advances efficient optimization techniques for large-scale machine learning.<\/p>\n<figure class=\"full-width\">\n<div><figcaption>Experiments on the CIFAR-10 dataset. 
The proposed algorithm is comparable with SGD and outperforms the original SARAH algorithm.<\/figcaption><\/div>\n<\/figure>\n<p>The bachelor&#8217;s <a href=\"https:\/\/github.com\/intsystems\/Khafizov-BS-Thesis\/blob\/master\/paper\/BachelorThesis_paper.pdf\" rel=\"noopener noreferrer nofollow\">theses<\/a> of<strong> Fanis Khafizov<\/strong> <a href=\"https:\/\/github.com\/intsystems\/Kasiuk-BS-Thesis\/blob\/main\/docs\/Kasiuk2024CompressionForDistributedOptimization.pdf\" rel=\"noopener noreferrer nofollow\">and<\/a><strong> Vadim Kasiuk<\/strong>, supervised by <strong>Dr. Aleksandr Beznosikov<\/strong>, introduce ImpK,<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[],"tags":[],"class_list":["post-473229","post","type-post","status-publish","format-standard","hentry"],"_links":{"self":[{"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=\/wp\/v2\/posts\/473229","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=473229"}],"version-history":[{"count":0,"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=\/wp\/v2\/posts\/473229\/revisions"}],"wp:attachment":[{"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=473229"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/savepearlharbor.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=473229"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/savepearlharbor.com\/i
ndex.php?rest_route=%2Fwp%2Fv2%2Ftags&post=473229"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}