Speakers - XIX EMR: XIX Escola de Modelos de Regressão (XIX EMR) and V Encontro Paraibano de Estatística (V EPBEST)

Gauss Moutinho Cordeiro

UFPE

TITLE: A new flexible regression model with application to recovery probability Covid-19 patients

ABSTRACT: We propose a generalized odd log-logistic Maxwell mixture model to analyze the effect of gender and age group on the lifetimes and recovery probabilities of Chinese individuals with COVID-19. We add new properties of the generalized Maxwell model. The regression coefficients and the recovered fraction are estimated by maximum likelihood. Our findings suggest that the proposed model could be a good alternative for analyzing censored lifetimes of individuals with COVID-19.

SHORT BIO: Gauss Cordeiro holds a master's degree in Operations Research from COPPE/UFRJ and a Ph.D. in Statistics from Imperial College London. According to research.com, he has published 538 articles, with nearly 18 million views. He has supervised 59 master's and doctoral students. He is Professor Emeritus at UFPE and a member of the Brazilian Academy of Sciences.

Raffaele Argiento

Università degli Studi di Bergamo, Italy

TITLE: Model-Based Clustering: A Bayesian Nonparametric Perspective

ABSTRACT: Mixture models are the prototypical tool for clustering. In the Bayesian framework, a traditional distinction has been made between mixtures with a finite number of components and those with an infinite number. Recently, however, the literature has increasingly recognized that both types of mixture models can be addressed with a unified set of tools; indeed, the clustering problem is inherently nonparametric. Adopting this perspective, in this talk I will first present a general definition of a mixture model with a random number of components, framing it within a nonparametric context and highlighting how its probabilistic structure is naturally linked to point process theory. Building on this connection, I will show how point process theory can be leveraged to construct repulsive or attractive mixtures. I will illustrate these ideas with an example in which the mixture model is designed to cluster categorical data, emphasizing how dependence between the atoms (i.e., attractive interactions) can be exploited to study dependence structures in categorical datasets.

SHORT BIO: Raffaele Argiento is a Full Professor of Statistics at the Department of Economics of the University of Bergamo. His research focuses on the development of Bayesian methods for analyzing complex data, with particular attention to mixture models in a nonparametric context. His work has also been applied to problems in biology, sports analytics, and ecology. He has contributed to several competitively funded research projects, both as principal investigator and co-investigator. He is an Associate Editor of Statistics and Computing and a regular visiting scholar at the National University of Singapore. Raffaele is actively engaged in promoting and disseminating Bayesian statistics through the organization of conferences, workshops, and other initiatives, as a member of the International Society for Bayesian Analysis (ISBA) and the Italian Statistical Society (SIS). More details about his activities and publications are available on his website: www.raffaeleargiento.it.

Alan Hepburn Welsh

Australian National University

TITLE: Double descent and noise in fitting linear regression models

ABSTRACT: “Double descent” is used in statistical machine learning to describe the fact that models with more parameters than observations can have better predictive performance (as measured by the test error) than models with fewer parameters than observations. This challenge to the belief that simpler models are generally better means we need to rethink fundamental statistical ideas. We explore the effects of including noise predictors and noise observations when fitting linear regression models. We present empirical and theoretical results showing that double descent occurs in both cases, albeit with contradictory implications: for noise predictors, complex models are often better than simple ones, while for noise observations, relatively simple models are often better than complex ones. That is, double descent is not just a high-dimensional big data/machine learning phenomenon but can also occur in small datasets fitted with simple statistical models. We resolve this contradiction by showing that it is not model complexity but rather the implicit shrinkage induced by including noise in the model that drives double descent. We also show that including noise observations in the model makes the (usually unbiased) ordinary least squares estimator biased, and that the ridge regression estimator may then need a negative ridge parameter to avoid over-shrinkage.

SHORT BIO: Alan Welsh is a Full Professor of Statistics at the Australian National University and a member of the Australian Academy of Science, the Institute of Mathematical Statistics, the American Statistical Association, and the Australian Mathematical Society. He has received several awards for his research on statistical inference, statistical modeling and robustness (especially in mixed models and model selection for mixed models), and the analysis of sample surveys. He currently develops methods for analyzing correlated and high-dimensional data, exploring the relationship between sufficient dimension reduction and model selection. His research has been funded by several government and industry bodies, including the Australian Research Council, which has awarded him five Discovery Projects, three Linkage Projects, and one Large Grant. His work has been published in leading journals, including Annals of Statistics, Journal of the American Statistical Association, Journal of the Royal Statistical Society Series B, Biometrika, Biometrics, Statistica Sinica, and Econometric Theory. He has also published articles in American Naturalist, Biological Conservation, PloS One, Ecological Modelling, Wildlife Research, Chemometrics and Intelligent Laboratory Systems, Journal of Chemical Information and Modelling, and Journal of Hydrology.

Flávio Bambirra Gonçalves

UFMG

TITLE: The Poisson-Gaussian Mixture Process: A Flexible and Robust Approach for Non-Gaussian Geostatistical Modeling.

ABSTRACT: In this talk, I will introduce a novel family of geostatistical models designed to capture complex features beyond the reach of traditional Gaussian processes. The proposed family, termed the Poisson-Gaussian Mixture Process (POGAMP), is hierarchically specified, combining the infinite-dimensional dynamics of Gaussian processes with any multivariate continuous distribution. This combination is stochastically defined by a latent Poisson process, allowing the POGAMP to define valid processes with finite-dimensional distributions that can approximate any continuous distribution. Unlike other non-Gaussian geostatistical models that may fail to ensure validity of the processes by assigning arbitrary finite-dimensional distributions, the POGAMP preserves essential probabilistic properties crucial for both modeling and inference. Formal results regarding the existence and properties of the POGAMP are established, highlighting its robustness and flexibility in capturing complex spatial patterns. To support practical applications, a carefully designed MCMC algorithm is developed for Bayesian inference when the POGAMP is discretely observed over some spatial domain. Extensive simulations evaluate the modeling strengths of the POGAMP and the efficiency of the MCMC algorithm, with real data analyses showcasing the advantages of the proposed methodology in applied contexts.

SHORT BIO: Flávio Gonçalves earned his Ph.D. in Statistics from the University of Warwick in 2011 and is now a Professor of Statistics at the Federal University of Minas Gerais. His work focuses on computational statistics and Monte Carlo methods, and on Bayesian modeling and inference for infinite-dimensional problems (including geostatistics and models driven by stochastic differential equations), alongside broader interests in mathematical statistics. His recent articles have been published in leading journals such as Journal of the Royal Statistical Society: Series B, Journal of the American Statistical Association, and Biometrika, reflecting both methodological advances and substantive applications of his research.

Paulo Orenstein

IMPA

TITLE: Split Conformal Prediction and Extensions to Non-exchangeable Data

ABSTRACT: Machine learning algorithms offer state-of-the-art predictive performance in a variety of domains, but often lack an associated measure of uncertainty regarding their predictions. Split conformal prediction is a leading tool to obtain predictive intervals with virtually no assumptions beyond data exchangeability. This crucial assumption, however, hinders its applicability to many important types of data, such as time series and spatially dependent processes. In this talk, we will introduce split CP and show how it can be extended to non-exchangeable settings at the cost of a small coverage penalty. The proposed framework, based on data decoupling and concentration of measure inequalities, works more generally than traditional split CP, and experiments corroborate our coverage guarantees even under highly dependent data. This is joint work with Roberto Imbuzeiro Oliveira, Thiago Ramos and João Vitor Romano.
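As a rough illustration of the standard split conformal procedure mentioned in the abstract (for exchangeable data only, not the non-exchangeable extension presented in the talk), the sketch below fits a simple model on one part of the data, computes absolute residuals on a separate calibration part, and uses an order statistic of those residuals as the half-width of the prediction interval. The simulated data, the linear model, and all function names are assumptions made for this example.

```python
import math
import random


def conformal_quantile(scores, alpha):
    """Order statistic used by split conformal prediction.

    Returns the ceil((n + 1) * (1 - alpha))-th smallest calibration score,
    or infinity when the calibration set is too small for the level.
    """
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(scores)[k - 1]


def fit_line(xs, ys):
    """Least-squares intercept and slope for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def simulate(n, rng):
    """Hypothetical i.i.d. data: y = 1 + 2x + standard normal noise."""
    xs = [rng.uniform(-2.0, 2.0) for _ in range(n)]
    ys = [1.0 + 2.0 * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


def demo(seed=0, alpha=0.1):
    rng = random.Random(seed)
    x_fit, y_fit = simulate(500, rng)    # used only to fit the model
    x_cal, y_cal = simulate(1000, rng)   # used only to calibrate
    x_new, y_new = simulate(5000, rng)   # fresh points to check coverage

    intercept, slope = fit_line(x_fit, y_fit)

    def predict(x):
        return intercept + slope * x

    # Conformity scores: absolute residuals on the calibration split.
    scores = [abs(y - predict(x)) for x, y in zip(x_cal, y_cal)]
    q = conformal_quantile(scores, alpha)

    # The interval for a new x is [predict(x) - q, predict(x) + q].
    covered = sum(abs(y - predict(x)) <= q for x, y in zip(x_new, y_new))
    return q, covered / len(y_new)


if __name__ == "__main__":
    half_width, coverage = demo()
    print(f"half-width: {half_width:.3f}, empirical coverage: {coverage:.3f}")
```

Under exchangeability, the interval built this way covers a new response with probability at least 1 - alpha, whatever model is used for the fit; with dependent data such as time series, this guarantee no longer follows directly, which is the gap the talk addresses.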

SHORT BIO: Paulo Orenstein is an associate professor at IMPA. His research areas include statistics and machine learning, both from a theoretical and a practical point of view. His theoretical interests span robust estimators, non-convex optimization and scalable Bayesian methods. From a practical standpoint, Paulo has experience working with weather prediction, deforestation monitoring, and medical data. He received his Ph.D. in Statistics from Stanford University. Previously, he obtained his B.S. and M.S. degrees from PUC-Rio, in Brazil.

Taiane Schaedler Prass

UFRGS

TITLE: PRTree: more flexibility, more uncertainty, better performance?

ABSTRACT: We present an investigation of probabilistic regression trees (PRTree, Probabilistic Regression Trees) as an alternative to the traditional CART method in different statistical modeling contexts, focusing on three central settings: prediction, imputation of missing values, and parameter estimation from imputed data. PRTrees generalize traditional decision trees by replacing hard splits with probability functions, yielding smooth, continuous responses. As a result, the method performs better in prediction tasks when the relationship between the response and the covariates is smoother.

We also discuss an adaptation of PRTree to settings in which the covariates may be only partially observed. The resulting formulation allows more flexible decisions and avoids discarding observations with incomplete information. Three distinct strategies are covered: (i) uniform probability assignment, (ii) partial conditioning on the regions compatible with the partially observed vectors, and (iii) smoothed projection onto the observed dimensions.

We further explore the application of PRTree to time series, where the covariates may include lags of the series itself, alone or in combination with other explanatory variables. This approach incorporates the temporal structure of the process into the decision tree, extending its use to dynamic and potentially nonlinear settings without requiring explicit models of the process dynamics.

We discuss the results of evaluating the method, based on three complementary studies: the first compares the predictive performance of the methods with complete data; the second analyzes imputation accuracy in the time series setting; and the third investigates whether the methods that impute the data best also yield more accurate estimates of model parameters or dependence measures.
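To make the idea of replacing hard splits with probability functions concrete, here is a minimal sketch of a single split (a stump) in both forms. It is not the PRTree package's implementation: the Gaussian weighting, the `sigma` bandwidth, and the function names are assumptions chosen for illustration. The hard stump jumps at the threshold, while the soft stump averages the two leaf values with weights given by the probability that a noisy version of `x` falls on each side.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def hard_stump(x, threshold, left_value, right_value):
    """CART-style split: a step function of x."""
    return left_value if x <= threshold else right_value


def soft_stump(x, threshold, left_value, right_value, sigma):
    """Probabilistic split (assumed Gaussian form).

    The weight of the left leaf is P(x + eps <= threshold) with
    eps ~ N(0, sigma^2), so the prediction varies smoothly in x.
    """
    p_left = normal_cdf((threshold - x) / sigma)
    return p_left * left_value + (1.0 - p_left) * right_value


if __name__ == "__main__":
    for x in (-1.0, -0.25, 0.0, 0.25, 1.0):
        hard = hard_stump(x, 0.0, 1.0, 3.0)
        soft = soft_stump(x, 0.0, 1.0, 3.0, sigma=0.5)
        print(f"x={x:+.2f}  hard={hard:.2f}  soft={soft:.3f}")
```

As `sigma` shrinks toward zero the soft stump approaches the hard one, which is one way to see the probabilistic tree as a generalization of the traditional tree.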

SHORT BIO: Taiane Schaedler Prass holds a degree in Mathematics (Licenciatura Plena) from the Universidade Federal de Santa Maria (2006) and a master's degree and a doctorate in Mathematics from the Universidade Federal do Rio Grande do Sul (2008 and 2013), with an emphasis on Probability and Mathematical Statistics. She worked as a data scientist at StatSoft South America (2014-2015). She is currently a professor in the Department of Statistics at UFRGS and a permanent member of the Graduate Program in Statistics (PPGEst) at the same institution. Her expertise is in Mathematical Statistics, with a focus on time series, nonlinear stochastic processes, long-range dependence models, and missing data imputation. She is the lead developer of the R packages PPMiss, BTSR, PTSR, and DCCA, and also contributes to the development of the PRTree package. Research grants received include a new-Ph.D. grant (FAPERGS, 2017), a new-hire grant (UFRGS, 2019), and a CNPq research productivity fellowship (Call No. 18/2024).

Marcos Oliveira Prates

UFMG

TITLE: Advances in Spatial Statistics for Large-Scale and Complex Domains

ABSTRACT: The proliferation of large-scale geospatial data from sources such as satellite remote sensing and cellular phone networks has created a need for new statistical methods capable of handling massive datasets and complex spatial domains, as classical techniques often face prohibitive computational burdens and restrictive assumptions. In this talk, I discuss recent advances that directly address some of these challenges, primarily through the development of a scalable model that reduces computational complexity from cubic to near-linear in the number of observations. Further, we explore some of its applications. Beyond scalability, progress has been made in tailoring methods for complex domains by defining a process using appropriate distance metrics. The synthesis of these scalable and geometrically aware methods empowers practitioners to extract meaningful insights from vast and intricate spatial data. Again, we revisit applications in other spatial domains. This work was partially funded by FAPEMIG and CNPq.

This is joint work with Carlos Gonzáles, Dipak K. Dey, Håvard Rue, Heitor Ramos, Lucas Godoy, Lucas Michelin, Jun Yan, and Zaida Quiroz.

SHORT BIO: Marcos Prates received his bachelor's degree in Computational Mathematics from the Universidade Federal de Minas Gerais (UFMG) in 2006 and his master's degree in Statistics from the same institution in 2008. In 2011, he received his Ph.D. in Statistics from the University of Connecticut, where he was also a visiting professor between 2019 and 2020. He is currently an associate professor at UFMG. His main research areas are Bayesian Statistics, Generalized Linear Mixed Models, Machine Learning, and Spatial Statistics. He has also served as Coordinator of the Graduate Program in Statistics at UFMG (2016-2018), as Secretary (2015-2016) and Treasurer (2023-2024) of ISBRA, the Brazilian chapter of ISBA, and as President of the Associação Brasileira de Estatística (ABE) (2020-2022). He is currently the Coordinator of the undergraduate program in Data Science at UFMG.