The Power, the Privacy, and the Ethical Balancing Act when Deploying LLMs at the Edge
Plus an invitation to join our expert panel discussion on April 17, 2025, at 15:00, where we'll dig into all of this in detail.
Large Language Models (LLMs) are no longer just cloud-bound residents. We see them shrinking down, quietly taking up space in our phones, cars, and small smart devices like our doorbells. But what does it mean in practice when a language model doesn’t live in a data centre, but in your pocket?
This shift means that powerful AI models can now operate right at the “edge” of the network, closer to where data is generated and used.
There are several reasons why companies are pushing to bring LLMs to the edge.
First, there's the benefit of lower latency. With the model running locally, there's no need to send requests over the internet and wait for a server to respond; interactions become nearly instantaneous. Then there's offline functionality, which opens up AI-powered features in environments where connectivity is limited, spotty, or intentionally restricted. Perhaps most compelling is the promise of enhanced privacy, keeping user data on-device rather than transmitting it to the cloud. And of course, there's a cost incentive: reducing the dependency on large-scale cloud infrastructure helps companies scale AI features to millions of users without bearing the full burden of server-side computation.
However, privacy is not the same as security. While on-device models may reduce the need to send data to a centralised server, that doesn’t guarantee that the data is handled ethically, or even protected well. If local storage is poorly secured or if the model itself can be manipulated, sensitive information may still be at risk. Keeping data local may limit some forms of surveillance, but it also limits oversight.
Then there is the ethical landscape, which at the edge is far from simple. The issue of bias, for instance, doesn’t disappear just because a model is closer to the user. If anything, it becomes harder to detect and correct.
Edge models are often "frozen" after deployment, meaning any problematic behaviour baked into them—stereotypes, skewed perspectives, or cultural blind spots—can persist unchecked. Without centralised monitoring or ongoing evaluation, biased behaviour can go unnoticed for long stretches of time.
Another pressing concern is consent and autonomy. Most people understand that speaking to a chatbot on a website involves some form of AI interaction. But what about their smart fridge? Or their voice assistant that now runs an LLM locally? As these models become embedded into everyday devices, it's not always obvious when users are engaging with AI, or what the AI is doing with their data. Lack of transparency undermines informed consent and erodes trust.
Finally, we face a fundamental question of sustainability. On the surface, moving AI processing to the edge seems greener, offloading work from massive, energy-hungry data centres. But does this mean that edge AI is inherently sustainable?
Running LLMs locally can increase power consumption on billions of devices, from phones to smart appliances, many of which weren’t originally designed to handle such intensive workloads. This not only impacts battery life and hardware longevity but also contributes to the collective carbon footprint of AI. And with frequent updates or model swaps, we risk creating new forms of e-waste and energy redundancy.
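To make that scale concrete, here is a deliberately rough back-of-envelope sketch. Every number in it (fleet size, query rate, joules per response) is an illustrative assumption, not a measurement; real per-inference energy varies enormously with model size and hardware.

```python
# Back-of-envelope estimate of the collective energy cost of on-device LLM
# inference. All figures below are illustrative assumptions, not measurements.

JOULES_PER_KWH = 3.6e6  # 1 kWh = 3.6 million joules

def fleet_energy_kwh_per_year(devices: int,
                              inferences_per_day: float,
                              joules_per_inference: float) -> float:
    """Annual energy use of a fleet of edge devices, in kWh."""
    joules_per_day = devices * inferences_per_day * joules_per_inference
    return joules_per_day * 365 / JOULES_PER_KWH

# Hypothetical scenario: 1 billion devices, 50 short queries per day,
# ~20 J per on-device response (assumed; strongly hardware-dependent).
annual_kwh = fleet_energy_kwh_per_year(1_000_000_000, 50, 20)
print(f"{annual_kwh / 1e6:.0f} GWh per year")  # prints "101 GWh per year"
```

Even with these conservative toy numbers, the fleet-wide total lands in the tens-to-hundreds of gigawatt-hours per year, which is why "greener than a data centre" is not the same as "sustainable".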
So how do we balance these challenges and advantages in practice? What other challenges do technical teams face when deploying LLMs at the edge, and what leadership strategies can help overcome them?
To explore these complexities and showcase solutions from the field, we’ve brought together a panel of leading researchers and practitioners who will share their experiences deploying LLMs beyond the cloud, across sectors like design, finance, telecom, and distributed systems. We’re excited to invite you to the next expert-led session in CeADAR’s Lighthouse Projects Programme: “LLMs at the Edge: Challenges and Promises for Real-Time Intelligence”.
🗓️ Date: Thursday, April 17, 2025
🕒 Time: 15:00 – 16:30 IST (UTC+1)
📍 Location: Online (Zoom Webinar)
You can expect real-world insights from panelists working on edge AI applications in diverse contexts, and a practical discussion on optimising LLMs for edge deployment, from compression to latency handling. Expect an exploration of ethical and security considerations at the edge, and prepare for a robust Q&A and networking opportunities with an audience of professionals, researchers, and strategists.
Meet the Panelists:
Avinash Thakur – Principal AI Researcher, Monotype. Avinash leads innovation in Generative AI and deep learning for design technologies. His recent work focuses on deploying high-performance AI on edge devices, including image generation using Stable Diffusion, model compression, and inference acceleration. With over a decade of experience in applied AI across Monotype, OPlus India, and Samsung Research, Avinash brings deep expertise in mobile AI, computer vision, and NLP. He holds several patents and is a gold medallist from NIT Patna.
Dr. Hana Khamfroush – Associate Professor, University of Kentucky. Hana is an Associate Professor of Computer Science at the University of Kentucky and Director of the NetScience Research Lab. Her research spans Edge AI, Federated Learning, and scalable network optimization, with funding support from NASA, Cisco, and the NSF CAREER Award. A dedicated educator and advocate for women in computing, she also leads the ACM-W chapter at her university. Hana’s work advances distributed intelligence while promoting equity in STEM.
Jayeeta Putatunda – Director, AI Centre of Excellence, Fitch Group. Jayeeta is a GenAI leader known for her scalable NLP solutions, Edge AI expertise, open-source exploration, and advocacy for women in tech. Her career bridges technical innovation and strategic leadership, earning her the AI100 award and recognition as one of the Top 25 Women in FinTech AI. An advocate for diversity in tech, Jayeeta is also NYC Chapter Lead for Women in AI and frequently speaks at leading AI forums, including ICML and ODSC.
This session is designed for:
AI engineers, data scientists, and technical professionals developing real-time or edge applications
AI researchers interested in model compression, distributed learning, and ethical deployment
Business and product strategists exploring practical and sustainable LLM applications in complex environments
Students and emerging professionals eager to learn from real-world experiences in the edge AI space
Register now!
Please note that this panel is part of a broader educational series on sustainable and ethical LLMs. The registration link will direct you to the registration page for the full series. You are very welcome to attend just this session and unsubscribe from future updates afterwards, or stay with us for the rest of the series. It’s completely free either way!
We look forward to seeing you there!