
 2022-08-28 02:08

A partial least-squares regression approach to land use studies in the Suzhou-Wuxi-Changzhou region

ZHANG Yang1,2, ZHOU Chenghu1, ZHANG Yongmin3

1 Introduction

Empirical statistical analysis has been proposed for describing land use in quantitative terms and for testing the importance of its influencing factors (Turner II et al., 1995; Hoshino, 1996; Veldkamp and Fresco, 1997; Verburg and Chen, 2000). Various statistical methods, such as multiple linear regression, canonical correlation analysis and principal component analysis, have been adopted in these studies. They aim at a relatively limited time scale, but are especially useful in regions that are still considerably restricted by biophysical and socio-economic conditions (Hoshino, 1996; Veldkamp and Fresco, 1997; De Koning et al., 1998). However, a problem in applying conventional methods to land use studies lies in their inability to deal with the multicollinearity that exists among land use types and biophysical and socio-economic indicators, especially in cases with few observations. Remaining multicollinearity and limited data availability often cause indirect relationships between dependent and independent variables (Gardner, 1998).

Multicollinearity is defined as the existence of nearly linear dependency among the dependent and independent variables. Consider the fitted values given by the solution of the ordinary least-squares regression equation:

$$\hat{Y} = X (X^{T} X)^{-1} X^{T} Y$$

Under ordinary conditions, $\hat{Y}$ is a good estimator of Y. However, multicollinearity among the variables and an inadequate number of observations invalidate this estimator.
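
Why too few observations break the least-squares solution can be seen from a standard rank argument. It is added here as an illustration and is not part of the original text; $n$ (the number of observations, i.e. rows of $X$) and $p$ (the number of explanatory variables, i.e. columns of $X$) are symbols introduced only for this sketch.

```latex
\[
\operatorname{rank}\left(X^{T} X\right) = \operatorname{rank}(X) \le \min(n, p)
\quad\Longrightarrow\quad
\operatorname{rank}\left(X^{T} X\right) < p \ \text{ whenever } n < p .
\]
```

Because $X^{T} X$ is a $p \times p$ matrix, a rank below $p$ means it is singular, so $(X^{T} X)^{-1}$ does not exist. Near-linear dependency among the columns of $X$ is the milder form of the same problem: $X^{T} X$ remains invertible but has an eigenvalue close to zero, and its inverse then magnifies small errors in the data.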

The existence of multicollinearity may result in wide confidence intervals for individual parameters (unstable estimates), may give estimates with the wrong signs, and may affect decisions in hypothesis testing. Severe multicollinearity may make the estimates so unstable that they are practically useless (Fikri and Fikri, 2000).
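
The instability described above can be reproduced with a small numerical experiment. The sketch below is illustrative only: the data are made up, the model has just two predictors and no intercept, and the names `fit_two_predictors`, `wiggle` and `coefficients` are hypothetical helpers, not anything from the study. It fits the same true relation (both coefficients equal to 1) once with a nearly duplicated predictor and once with an uncorrelated one, then adds the same small disturbance to the response.

```python
def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y = b1*x1 + b2*x2 via the 2x2 normal equations."""
    s11 = sum(a * a for a in x1)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s22 = sum(b * b for b in x2)
    g1 = sum(a * c for a, c in zip(x1, y))
    g2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12  # determinant of X^T X, near zero if collinear
    b1 = (s22 * g1 - s12 * g2) / det
    b2 = (s11 * g2 - s12 * g1) / det
    return b1, b2


# Made-up, mean-centered data (an assumption for illustration).
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
wiggle = [0.01, -0.02, 0.02, -0.02, 0.01]
x2_collinear = [a + d for a, d in zip(x1, wiggle)]  # almost a copy of x1
x2_independent = [2.0, -1.0, -2.0, -1.0, 2.0]       # uncorrelated with x1
noise = [10 * d for d in wiggle]                    # small disturbance of y
zero = [0.0] * len(x1)


def coefficients(x2, disturbance):
    # True relation: y = 1*x1 + 1*x2, plus an optional disturbance.
    y = [a + b + e for a, b, e in zip(x1, x2, disturbance)]
    return fit_two_predictors(x1, x2, y)


clean_collinear = coefficients(x2_collinear, zero)
noisy_collinear = coefficients(x2_collinear, noise)
noisy_independent = coefficients(x2_independent, noise)
print(clean_collinear, noisy_collinear, noisy_independent)
```

With the nearly duplicated predictor, a disturbance no larger than 0.2 moves the estimates from about (1, 1) to about (-9, 11), so one coefficient changes sign. With the uncorrelated predictor the same disturbance leaves the estimates close to (1, 1).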

The partial least-squares (PLS) regression method was developed in the 1970s by Herman O.A. Wold (Wold, 1975; Wold et al., 1984). It is a statistical tool specifically designed to deal with multiple regression problems in which the number of observations is limited and the correlations between variables are high. PLS regression has been highly successful in scientific fields such as chemometrics (Geladi and Kowalski, 1986; Höskuldsson, 1988). In technical terms, PLS aims at producing a model that transforms a set of correlated explanatory variables into a new set of uncorrelated variables, called PLS factors in this paper. These factors capture most of the information in the independent variables that is useful for explaining and predicting the dependent variables. At the same time, PLS reduces the dimensionality of the regression by using fewer PLS factors than there are independent variables.
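
The two properties named above, uncorrelated factors and fewer factors than variables, can be checked on a toy example. The following is a minimal sketch and not the authors' implementation: it assumes a single response variable (the variant often called PLS1) and a NIPALS-style deflation scheme, whereas the study has several land use types as responses. The data are made up and the names `pls1_scores`, `dot` and `column` are hypothetical.

```python
from math import sqrt


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def column(X, j):
    return [row[j] for row in X]


def pls1_scores(X, y, n_factors):
    """Return the score vectors of the first n_factors PLS factors.

    X is a list of rows and y a list; both are assumed mean-centered.
    """
    X = [row[:] for row in X]  # work on copies, the inputs stay untouched
    y = y[:]
    n_vars = len(X[0])
    scores = []
    for _ in range(n_factors):
        # Weights proportional to X^T y: the unit direction whose score
        # has the largest covariance with y.
        w = [dot(column(X, j), y) for j in range(n_vars)]
        norm = sqrt(dot(w, w))
        w = [wj / norm for wj in w]
        t = [dot(row, w) for row in X]  # scores of this factor
        tt = dot(t, t)
        loadings = [dot(column(X, j), t) / tt for j in range(n_vars)]
        q = dot(y, t) / tt  # regression coefficient of y on t
        # Deflation: remove what this factor explains from X and y.
        X = [[x - ti * pj for x, pj in zip(row, loadings)]
             for row, ti in zip(X, t)]
        y = [yi - q * ti for yi, ti in zip(y, t)]
        scores.append(t)
    return scores


# Made-up, mean-centered data: 6 observations, 3 predictors (x2 tracks x1).
X = [[-5.0, -4.0, 2.0], [-3.0, -4.0, -1.0], [-1.0, -1.0, -1.0],
     [1.0, 1.0, -1.0], [3.0, 2.0, -1.0], [5.0, 6.0, 2.0]]
y = [-4.5, -3.5, -1.0, 0.5, 3.0, 5.5]

t1, t2 = pls1_scores(X, y, n_factors=2)
print(dot(t1, t2))  # inner product of the two score vectors
```

Two factors stand in for three correlated predictors here. Because the data are centered, each score vector has zero mean, so a vanishing inner product between `t1` and `t2` (up to rounding) means the two factors are uncorrelated.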

The objective of this paper is to summarize our initial attempts to develop a PLS regression approach to land use research through a case study of the Suzhou-Wuxi-Changzhou region in China. These attempts encompass efforts to gain a thorough understanding of the study area and the research data, to identify the underlying relevance of biophysical and socio-economic factors to land use, and to clarify the potential of the PLS regression approach for regional land use analysis. The paper quantifies and accounts for static relations between land use and its influencing factors.

3 Materials and methods

3.1 Data

Land use data that accurately and reliably reflect the land use pattern of the Suzhou-Wuxi-Changzhou region were taken from the land resources investigation project in 1996 (Lin and Ho, 2003). Biophysical and socio-economic data were collected from the statistical yearbooks of Suzhou, Wuxi and Changzhou in 1996, consistent with the investigation time of the land use data. The county was chosen as the sampling unit to keep the biophysical, socio-economic and land use data homogeneous. This choice limited the number of observations: the fifteen counties that belong to the region were taken as observation samples, far fewer than the number of variables included. All data were mean-centered and scaled to unit variance beforehand, which gave all variables the same weight in the analyses.
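
The preprocessing step mentioned at the end of this paragraph can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the indicator names and values are hypothetical, and the sample standard deviation (divisor n - 1) is assumed because the text does not say which divisor was used.

```python
from statistics import mean, stdev


def autoscale(values):
    """Mean-center a variable and scale it to unit (sample) variance."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation, divisor n - 1 (assumed)
    return [(v - m) / s for v in values]


# Hypothetical indicators measured on very different scales.
population_density = [400.0, 650.0, 900.0, 1150.0]
mean_temperature = [15.2, 15.6, 15.9, 16.1]

scaled = [autoscale(col) for col in (population_density, mean_temperature)]
for col in scaled:
    print(round(mean(col), 6), round(stdev(col), 6))
```

After scaling, both indicators have mean 0 and standard deviation 1, so neither dominates the analysis merely because of its measurement unit.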

3.2 Variables

Table 1 shows the land use types and potentially influencing factors used in the analysis. According to the land use data used in the study, eight land use types were taken into account: cultivated land, garden land, forest, grassland, built-up land, transport land, water area and unused land. Factors that potentially influenced the structure of land use were selected on the basis of a literature review and knowledge of the specific situation in the region. Five climatic indicators were included in the biophysical factors. Twenty major socio-economic indicators were included, relating to demography, the macroeconomy, industry, agriculture, transportation, and people's livelihood.

3.3 Partial least-squares (PLS) regression

The PLS model has the form

$$X = T P^{T} + E$$

$$Y = U Q^{T} + F$$

where X and Y are the matrices of explanatory variables and response variables, respectively. The matrices of this model are defined as T = X-scores, U = Y-scores, P = X-loadings, Q = Y-loadings, E = X-residuals, and F = Y-residuals. PLS algorithms choose successive orthogonal factors (PLS factors) that maximize the covariance between each T and the corresponding U. Generally, these factors capture as much of the variance information in X and Y as possible. At the same time, they derive a useful relation between X and Y. For a good PLS model, the first few factors show a high correlation between T and U. To ob
