[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"article-hycnns-convex-learning-optimal-transport-en":3,"tags-hycnns-convex-learning-optimal-transport-en":30,"related-lang-hycnns-convex-learning-optimal-transport-en":41,"related-posts-hycnns-convex-learning-optimal-transport-en":45,"series-research-a05a33ed-3b47-456a-9dd5-9582c0e10bf1":82},{"id":4,"title":5,"content":6,"summary":7,"source":8,"source_url":9,"author":10,"image_url":11,"keywords":12,"language":18,"translated_content":10,"views":19,"is_premium":20,"created_at":21,"updated_at":21,"cover_image":11,"published_at":22,"rewrite_status":23,"rewrite_error":10,"rewritten_from_id":24,"slug":25,"category":26,"related_article_id":27,"status":28,"google_indexed_at":29,"x_posted_at":10,"tweet_text":10,"title_rewritten_at":10,"title_original":10,"key_takeaways":10,"topic_cluster_id":10,"embedding":10,"is_canonical_seed":20},"a05a33ed-3b47-456a-9dd5-9582c0e10bf1","HyCNNs: A Better Way to Learn Convex Functions","\u003Cp data-speakable=\"summary\">HyCNNs are a new convex network design that can scale better than ICNNs.\u003C\u002Fp>\u003Cp>Most neural nets are built to fit data, not to respect geometry. This paper tackles a narrower but very practical problem: learning functions that must stay convex, which matters in shape-constrained regression, interpolation, and optimal transport. The authors introduce \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.26942\">Hyper Input Convex Neural Networks\u003C\u002Fa>, or HyCNNs, and argue that they keep the convexity guarantees of input convex neural networks while being easier to scale and more expressive in practice.\u003C\u002Fp>\u003Cp>For engineers, the appeal is straightforward. If your model has to obey a structural constraint, you usually end up trading off flexibility, stability, and parameter count. HyCNNs are presented as a way to reduce that tradeoff. The paper claims the architecture is always convex in its input, can leverage depth, and performs reliably when trained at scale compared with standard ICNNs.\u003C\u002Fp>\u003Ch2>What problem this paper is trying to fix\u003C\u002Fh2>\u003Cp>Input convex neural networks, or ICNNs, are designed for learning convex functions by construction. That makes them useful whenever convexity is not optional but part of the problem definition. The catch is that these models can be expensive to express the kinds of functions people actually want, especially when the target function has more structure than a shallow convex form can capture.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1777530055596-owdg.png\" alt=\"HyCNNs: A Better Way to Learn Convex Functions\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>The paper’s core complaint is that existing ICNNs may need a lot of parameters to approximate even simple convex functions well. The authors focus on quadratic functions as a concrete example and show that, in theory, HyCNNs need exponentially fewer parameters than ICNNs to reach a given approximation precision. That is a strong claim about representational efficiency, and it is the main technical motivation for the new architecture.\u003C\u002Fp>\u003Cp>This matters because shape-constrained learning is often used when you care about interpretability, monotonic behavior, or physically plausible outputs. In those settings, a model that respects the constraint but burns too many parameters is harder to train, harder to deploy, and less attractive for large-scale use.\u003C\u002Fp>\u003Ch2>How HyCNNs work in plain English\u003C\u002Fh2>\u003Cp>HyCNNs combine ideas from Maxout networks and ICNNs. The result is a network that remains convex in its input by design, but is meant to be more expressive than a conventional ICNN. The paper frames this as a way to keep the mathematical guarantee while giving the model more room to use depth effectively.\u003C\u002Fp>\u003Cp>You can think of it as an architecture-level fix rather than a training trick. Instead of hoping a generic neural net learns convexity from data, HyCNNs bake convexity into the model structure. That means the constraint is not something you check after training; it is part of the network itself.\u003C\u002Fp>\u003Cp>The “hyper” part of the name signals that the architecture is built to improve on the standard ICNN template. The abstract does not spell out every implementation detail, but the high-level idea is clear: use Maxout-like building blocks inside a convex network so the model can represent more complex convex shapes without giving up the guarantee that the output remains convex in the input.\u003C\u002Fp>\u003Cp>That combination is important because convexity-constrained models often become too rigid when scaled naively. By aiming for a design that is still convex but more parameter-efficient, HyCNNs try to make structural constraints less painful in practice.\u003C\u002Fp>\u003Ch2>What the paper actually shows\u003C\u002Fh2>\u003Cp>The paper makes one theoretical claim and several empirical ones. On the theory side, the authors prove that HyCNNs require exponentially fewer parameters than ICNNs to approximate quadratic functions up to a chosen precision. The abstract does not provide the full proof details, but the point is that the new architecture has a much better approximation efficiency for at least this family of functions.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1777530057232-0fri.png\" alt=\"HyCNNs: A Better Way to Learn Convex Functions\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>On the experimental side, the paper reports synthetic experiments for convex regression and interpolation. In those tests, HyCNNs outperform existing ICNNs and plain MLPs in predictive performance. The abstract does not include the exact benchmark numbers, dataset sizes, or error metrics, so those details are not available here and should not be inferred.\u003C\u002Fp>\u003Cp>The authors also apply HyCNNs to high-dimensional optimal transport, both on synthetic examples and on single-cell RNA sequencing data. In those experiments, HyCNNs often outperform ICNN-based neural optimal transport methods and other baselines across a wide range of settings. Again, the abstract does not give the exact transport error values or runtime comparisons, so the available evidence is directional rather than fully quantified in the source text.\u003C\u002Fp>\u003Cul>\u003Cli>Theory: exponential parameter savings over ICNNs for approximating quadratic functions.\u003C\u002Fli>\u003Cli>Synthetic tasks: better predictive performance on convex regression and interpolation.\u003C\u002Fli>\u003Cli>Applied tasks: strong results on high-dimensional optimal transport, including single-cell RNA sequencing data.\u003C\u002Fli>\u003Cli>Comparisons: improvements over ICNNs, MLPs, and other baselines mentioned in the abstract.\u003C\u002Fli>\u003C\u002Ful>\u003Ch2>Why developers should care\u003C\u002Fh2>\u003Cp>If you build models for constrained prediction, this paper points to a useful design pattern: make the constraint part of the architecture, not a post-processing step. That can simplify training and make the behavior of the model easier to reason about. For convex problems, that also means you are less likely to accidentally violate the assumptions the rest of your pipeline depends on.\u003C\u002Fp>\u003Cp>The optimal transport angle is especially practical. Neural optimal transport methods are often used when you need to learn a map between distributions in high dimensions. If the learned map should be convex, or if convexity is a useful inductive bias for stability and generalization, HyCNNs may offer a more scalable option than older ICNN-based approaches.\u003C\u002Fp>\u003Cp>There is also a broader engineering lesson here: architecture choices can matter as much as optimization tricks. The paper suggests that by changing the network family itself, you can get better approximation efficiency and better empirical performance without abandoning the structural guarantees that make the model useful in the first place.\u003C\u002Fp>\u003Ch2>Limitations and open questions\u003C\u002Fh2>\u003Cp>The abstract gives a promising picture, but it leaves several practical questions unanswered. We do not get exact benchmark numbers, training details, parameter counts from the experiments, or ablation results showing which part of the architecture contributes most to the gains. That makes it hard to judge how much of the improvement comes from the Maxout-style design versus other implementation choices.\u003C\u002Fp>\u003Cp>The strongest theoretical claim is also limited to quadratic functions and approximation precision. That is still meaningful, but it does not automatically tell us how HyCNNs behave on every convex learning problem engineers might care about. The paper reports synthetic experiments and one real-world-inspired application, but the abstract does not establish broad production readiness.\u003C\u002Fp>\u003Cp>So the right takeaway is not that HyCNNs replace all ICNNs. It is that they look like a serious new option when you need convexity, want better parameter efficiency, and care about scaling beyond toy examples. The paper argues that those goals do not have to be in conflict, and that is the main reason this work is worth watching.\u003C\u002Fp>","HyCNNs mix Maxout and ICNN ideas to learn convex functions more efficiently and improve convex regression and optimal transport tasks.","arxiv.org","https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.26942",null,"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1777530055596-owdg.png",[13,14,15,16,17],"convex neural networks","ICNN","optimal transport","shape-constrained learning","Maxout","en",1,false,"2026-04-30T06:20:37.479646+00:00","2026-04-30T06:20:37.464+00:00","done","e3671537-a6b7-4e83-ac6f-144edd5183f8","hycnns-convex-learning-optimal-transport-en","research","84f424fe-d45e-485f-994c-62f2e9f407b9","published","2026-04-30T09:00:07.454+00:00",[31,33,35,37,39],{"name":16,"slug":32},"shape-constrained-learning",{"name":14,"slug":34},"icnn",{"name":13,"slug":36},"convex-neural-networks",{"name":15,"slug":38},"optimal-transport",{"name":17,"slug":40},"maxout",{"id":27,"slug":42,"title":43,"language":44},"hycnns-convex-learning-optimal-transport-zh","HyCNNs：更省參數的凸函數學習","zh",[46,52,58,64,70,76],{"id":47,"slug":48,"title":49,"cover_image":50,"image_url":50,"created_at":51,"category":26},"94994abd-e24d-4fd1-b941-942d03d19acf","turboquant-seo-shift-small-sites-en","TurboQuant and the SEO Shift for Small Sites","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778840455122-jfce.png","2026-05-15T10:20:28.134545+00:00",{"id":53,"slug":54,"title":55,"cover_image":56,"image_url":56,"created_at":57,"category":26},"670a7f69-911f-41e8-a18b-7d3491253a19","turboquant-vllm-comparison-fp8-kv-cache-en","TurboQuant vs FP8: vLLM’s first broad test","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778839858405-b5ao.png","2026-05-15T10:10:37.219158+00:00",{"id":59,"slug":60,"title":61,"cover_image":62,"image_url":62,"created_at":63,"category":26},"5aef1c57-961f-49f7-8277-f83f7336799a","llmbda-calculus-agent-safety-rules-en","LLMbda calculus gives agents safety rules","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778825459914-obkf.png","2026-05-15T06:10:36.242145+00:00",{"id":65,"slug":66,"title":67,"cover_image":68,"image_url":68,"created_at":69,"category":26},"712a0357-f7cd-48f2-adde-c2691da0815f","low-complexity-beamspace-denoiser-mmwave-mimo-en","A simpler beamspace denoiser for mmWave MIMO","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778814646705-e7mx.png","2026-05-15T03:10:31.764301+00:00",{"id":71,"slug":72,"title":73,"cover_image":74,"image_url":74,"created_at":75,"category":26},"f595f949-6ea1-4b0e-a632-f1832ef26e36","ai-benchmark-wins-cyber-scare-defenders-en","Why AI benchmark wins in cyber should scare defenders","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778807444539-gz7f.png","2026-05-15T01:10:30.04579+00:00",{"id":77,"slug":78,"title":79,"cover_image":80,"image_url":80,"created_at":81,"category":26},"3ad202d1-9e5f-49c5-8383-02fcf1a23cf2","why-linux-security-needs-patch-wave-mindset-en","Why Linux security needs a patch-wave mindset","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778741441493-ikl6.png","2026-05-14T06:50:25.906256+00:00",[83,88,93,98,103,108,113,118,123,128],{"id":84,"slug":85,"title":86,"created_at":87},"a2715e72-1fe8-41b3-abb1-d0cf1f710189","ai-predictions-2026-big-changes-en","AI Predictions for 2026: Brace for Big Changes","2026-03-26T01:25:07.788356+00:00",{"id":89,"slug":90,"title":91,"created_at":92},"8404bd7b-4c2f-4109-9ec4-baf29d88af2b","ml-papers-of-the-week-github-research-desk-en","ML Papers of the Week Turns GitHub Into a Research Desk","2026-03-27T01:11:39.480259+00:00",{"id":94,"slug":95,"title":96,"created_at":97},"87897a94-8065-4464-a016-1f23e89e17cc","ai-ml-conferences-to-watch-in-2026-en","AI\u002FML Conferences to Watch in 2026","2026-03-27T01:51:54.184108+00:00",{"id":99,"slug":100,"title":101,"created_at":102},"6f1987cf-25f3-47a4-b3e6-db0997695be8","openclaw-agents-manipulated-self-sabotage-en","OpenClaw Agents Can Be Manipulated Into Failure","2026-03-28T03:03:18.899465+00:00",{"id":104,"slug":105,"title":106,"created_at":107},"a53571ad-735a-4178-9f93-cb09b699d99c","vega-driving-language-instructions-en","Vega: Driving with Natural Language Instructions","2026-03-28T14:54:04.698882+00:00",{"id":109,"slug":110,"title":111,"created_at":112},"a34581d6-f36e-46da-88bb-582fb3e7425c","personalizing-autonomous-driving-styles-en","Drive My Way: Personalizing Autonomous Driving Styles","2026-03-28T14:54:26.148181+00:00",{"id":114,"slug":115,"title":116,"created_at":117},"2bc1ad7f-26ce-4f02-9885-803b35fd229d","training-knowledge-bases-writeback-rag-en","Training Knowledge Bases with WriteBack-RAG","2026-03-28T14:54:45.643433+00:00",{"id":119,"slug":120,"title":121,"created_at":122},"71adc507-3c54-4605-bbe2-c966acd6187e","packforcing-long-video-generation-en","PackForcing: Efficient Long-Video Generation Method","2026-03-28T14:55:02.646943+00:00",{"id":124,"slug":125,"title":126,"created_at":127},"675942ef-b9ec-4c5f-a997-381250b6eacb","pixelsmile-facial-expression-editing-en","PixelSmile Framework Enhances Facial Expression Editing","2026-03-28T14:55:20.633463+00:00",{"id":129,"slug":130,"title":131,"created_at":132},"6954fa2b-8b66-4839-884b-e46f89fa1bc3","adaptive-block-scaled-data-types-en","IF4: Smarter 4-Bit Quantization That Adapts to Your Data","2026-03-31T06:00:36.65963+00:00"]