# Least Squares Estimation of Spatial Autoregressive Models for Large-Scale Social Networks

With the rapid development of social networks, the spatial autoregressive (SAR) model is becoming an important tool in social network analysis. However, major bottlenecks remain in analyzing large-scale networks (Facebook, for example, has over 700 million active users): computational scalability, estimation consistency, and proper network sampling.

To address these challenges, we propose a novel least squares estimator (LSE) for analyzing large sparse networks based on the SAR model. Computationally, the cost of the LSE is linear in the network size, making it scalable to the analysis of very large networks. Theoretically, the LSE is $\sqrt{n}$-consistent and asymptotically normal under certain regularity conditions. We further develop a new LSE-based network sampling technique, which automatically adjusts for the autocorrelation between sampled and unsampled units and hence guarantees valid statistical inference. Moreover, we generalize the LSE approach from the classical SAR model to more complex networks with multiple sources of social interaction effects. Numerical results for simulated and real data are presented to illustrate the performance of the LSE.
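To make the idea concrete, the following is a minimal sketch of one way a least squares criterion for the SAR model can be set up. It is an illustration under stated assumptions, not necessarily the exact construction used in the paper. We assume the pure SAR model $Y = \rho W Y + \varepsilon$ with no covariates, a row-normalized weight matrix $W$ with zero diagonal, and mean-zero errors with common variance. Writing $M(\rho) = (I - \rho W)^\top (I - \rho W)$, the best linear predictor of $Y_i$ from the other responses leaves the residual $(M Y)_i / M_{ii}$, and the sketch minimizes the sum of squared residuals over a grid of $\rho$ values. The network generator, the function names (`make_network`, `simulate_sar`, `ls_objective`), and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def make_network(n, k, rng):
    """Hypothetical sparse directed network: each node follows up to k others."""
    offsets = rng.integers(1, n, size=(n, k))      # offsets in 1..n-1, so no self-loops
    rows = np.repeat(np.arange(n), k)
    cols = (rows + offsets.ravel()) % n
    codes = np.unique(rows * n + cols)             # drop duplicate edges
    rows, cols = np.divmod(codes, n)
    out_deg = np.bincount(rows, minlength=n)
    vals = 1.0 / out_deg[rows]                     # row-normalize W
    return rows, cols, vals

def w_dot(rows, cols, vals, y, n):
    """Sparse product W @ y from the edge list."""
    return np.bincount(rows, weights=vals * y[cols], minlength=n)

def wt_dot(rows, cols, vals, z, n):
    """Sparse product W.T @ z from the edge list."""
    return np.bincount(cols, weights=vals * z[rows], minlength=n)

def simulate_sar(rows, cols, vals, rho, n, rng, n_iter=200):
    """Draw Y = (I - rho W)^{-1} eps via the series sum_j rho^j W^j eps."""
    eps = rng.standard_normal(n)
    y = eps.copy()
    for _ in range(n_iter):
        y = rho * w_dot(rows, cols, vals, y, n) + eps
    return y

def ls_objective(rho, y, a, b, c, col_sq):
    """Sum of squared residuals (M y)_i / M_ii with M = (I - rho W)^T (I - rho W).

    a = W y, b = W^T y, c = W^T W y and col_sq = column sums of W**2 are
    precomputed once, so each evaluation costs O(n).
    """
    my = y - rho * (a + b) + rho**2 * c
    m_diag = 1.0 + rho**2 * col_sq                 # valid because diag(W) = 0
    return np.sum((my / m_diag) ** 2)

rng = np.random.default_rng(0)
n, k, rho_true = 10_000, 5, 0.4                    # illustrative values only
rows, cols, vals = make_network(n, k, rng)
y = simulate_sar(rows, cols, vals, rho_true, n, rng)

# Quantities that depend on the data and network but not on rho.
a = w_dot(rows, cols, vals, y, n)
b = wt_dot(rows, cols, vals, y, n)
c = wt_dot(rows, cols, vals, a, n)
col_sq = np.bincount(cols, weights=vals**2, minlength=n)

grid = np.linspace(-0.9, 0.9, 181)
q = np.array([ls_objective(r, y, a, b, c, col_sq) for r in grid])
rho_hat = grid[np.argmin(q)]
print(f"true rho = {rho_true}, grid estimate = {rho_hat:.2f}")
```

In this sketch every step is a pass over the edge list or over the nodes, so the cost grows linearly with the number of nodes plus edges, which is the kind of scaling the abstract refers to for sparse networks. A grid search is used only for transparency; a one-dimensional numerical optimizer could replace it.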