
Posts

SEM Sample Size Calculator

Estimate the minimum sample size for your Structural Equation Modelling (SEM) project using rule-based and literature-backed methods.

Inputs: SEM method (Covariance-Based SEM (CB-SEM) or Partial Least Squares SEM (PLS-SEM)), number of predictors, estimated number of indicators, and estimated number of parameters to estimate.

📌 Example 1: CB-SEM
Scenario: 4 latent variables, each with 5 indicators, giving 20 indicators. Estimated 30 model parameters.
Steps: Choose CB-SEM, enter Predictors: 4, Indicators: 20, Parameters: 30.
Result: 300 samples (Kline, 2016)

📌 Example 2: PLS-SEM
Scenario: Most complex latent variable has 6 predictors. Total 18 indicators.
Steps: Choose PLS-SEM, enter Predictors: 6, Indicators: 18, Parameters: 30.
Result: 100 samples minimum (Hair et al.,...
Recent posts

Single Proportion Sample Size Calculator

This calculator helps researchers determine the minimum sample size required to estimate a single population proportion with a specified confidence level and margin of error.

Inputs: estimated proportion (p), margin of error (d), and confidence level: 95% (Z = 1.96), 90% (Z = 1.64), or 99% (Z = 2.58). Output: minimum required sample size (n).

🧮 How to Use This Calculator
1. Enter the estimated population proportion (e.g. 0.5 if unknown).
2. Enter the desired margin of error (e.g. 0.05 for ±5%).
3. Select your desired confidence level (usually 95%).
4. Click "Calculate Sample Size".

This is commonly used in health, education, and social science research when estimating the prevalence or proportion of a characteristic in a population (e.g. % of smokers, % with a disease).

📚 Reference
Lwanga, S.K. & Lemeshow, S. (1991). Sample Size Determination in Health...
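The calculation behind a calculator of this kind can be sketched in Python. This is a minimal sketch that assumes the standard single-proportion formula n = Z² × p × (1 − p) / d², rounded up to the next whole number. The function name `single_proportion_sample_size` is hypothetical and is not taken from the calculator itself.

```python
import math


def single_proportion_sample_size(p, d, z=1.96):
    """Minimum n to estimate a proportion p within margin of error d.

    Assumes the standard formula n = z^2 * p * (1 - p) / d^2
    (an assumption; the calculator's own code is not shown).
    """
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if d <= 0:
        raise ValueError("d must be positive")
    n = (z ** 2) * p * (1 - p) / (d ** 2)
    # Round up so the achieved margin of error is no worse than d.
    return math.ceil(n)


# p = 0.5 (unknown proportion), +/-5% margin of error, 95% confidence
print(single_proportion_sample_size(p=0.5, d=0.05, z=1.96))  # 385
```

Using p = 0.5 maximizes p × (1 − p), which is why it is the conservative choice when the true proportion is unknown.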

How SEM Bridges Theory and Measurement in Social Science Research

As a seasoned researcher, I have seen firsthand how Structural Equation Modeling, or SEM, transforms the way we approach studies in the social sciences. It brings together measurement and theory in a unified statistical framework that goes beyond what traditional regression or factor analysis can offer.

What Makes SEM Unique?
You can think of SEM as a combination of two powerful tools. First, there is confirmatory factor analysis, which helps us measure hidden constructs. Then, there is path analysis, which allows us to map out relationships between variables. By blending these approaches, SEM helps us understand not just what we measure, but how those concepts connect and interact in real life.

Why Do Social Scientists Prefer SEM?
Modeling...

Case-Control Sample Size Calculator (based on odds ratio and exposure rate in controls)

Inputs: proportion exposed in controls ($p_0$), expected odds ratio (OR), significance level α (0.05, $Z_{\alpha/2} = 1.96$; or 0.10, $Z_{\alpha/2} = 1.64$), power 1 − β (80%, $Z_\beta = 0.84$; or 90%, $Z_\beta = 1.28$), and control-to-case ratio ($r$). Outputs: estimated number of cases ($n_1$) and controls ($n_0$).

📘 About This Calculator
This calculator estimates the required sample size for a case-control study, based on the expected odds ratio (OR), control exposure proportion ($p_0$), significance level (α), desired power (1 − β), and control-to-case ratio ($r$).

Formula used:

$$n_1 = \frac{(Z_{\alpha/2} + Z_\beta)^2 \,(r + 1)\, \bar{p}\,(1 - \bar{p})}{r\,(p_1 - p_0)^2}$$

where
$n_1$ = number of cases
$n_0$ = number of controls = $r \times n_1$
$\bar{p} = (p_1 + r\,p_0) / (r + 1)$, which reduces to $(p_0 + p_1)/2$ when $r = 1$
$p_1 = (\mathrm{OR} \times p_0) / [1 + p_0(\mathrm{OR} - 1)]$

The odds ratio enters through $p_1$, the exposure proportion expected among cases; the denominator is the squared difference between the two exposure proportions.

📚 Reference
Kelsey JL, Whittemore AS, Evans AS, Thompson WD. Methods in Observational Epidemiology. 2nd ed. New York: Oxford University Press; 1996. See Chap...

Krejcie & Morgan sample size calculator

Input: population size (N). Output: recommended sample size (n).

📘 About This Calculator
This calculator uses the Krejcie & Morgan (1970) formula to estimate the minimum sample size required when the total population size is known. It is commonly used in social sciences, education, and health research.

The formula is:

$$n = \frac{X^2\, N\, P\,(1 - P)}{d^2\,(N - 1) + X^2\, P\,(1 - P)}$$

where
$X^2$ = 3.841 (chi-square value for 1 degree of freedom at the 95% confidence level)
$P$ = 0.5 (maximum variability)
$d$ = 0.05 (±5% precision)

📚 Citation
Krejcie, R.V., & Morgan, D.W. (1970). Determining Sample Size for Research Activities. Educational and Psychological Measurement, 30(3), 607–610. https://doi.org/10.1177/001316447003000308
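The formula above can be sketched in a few lines of Python. This is an illustrative sketch, not the calculator's own code: the function name `krejcie_morgan_sample_size` is hypothetical, and rounding the result up to the next whole number is an assumption.

```python
import math

CHI_SQUARE_95 = 3.841  # X² for 1 degree of freedom at 95% confidence


def krejcie_morgan_sample_size(population, p=0.5, d=0.05, chi_sq=CHI_SQUARE_95):
    """Sample size for a known population size, per the formula above.

    n = X² N P (1 - P) / (d² (N - 1) + X² P (1 - P))
    Rounding up is an assumption made for this sketch.
    """
    if population < 1:
        raise ValueError("population must be at least 1")
    numerator = chi_sq * population * p * (1 - p)
    denominator = d ** 2 * (population - 1) + chi_sq * p * (1 - p)
    return math.ceil(numerator / denominator)


print(krejcie_morgan_sample_size(1000))  # 278
print(krejcie_morgan_sample_size(100))   # 80
```

As N grows, the result levels off near X² × P × (1 − P) / d², which is about 384 with the default values, so very large populations do not require proportionally larger samples.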

Sample Size Calculation for Questionnaires using R

To calculate sample size for a questionnaire study, you will need to consider a few factors:

The type of statistical test you will be using: Different statistical tests have different assumptions and requirements, so you will need to choose a test that is appropriate for your data and research question.

The desired level of precision: The sample size should be large enough to provide the desired level of precision in your estimates. For example, if you want to be able to detect small differences between groups, you will need a larger sample size than if you are only interested in detecting large differences.

The expected response rate: The sample size should be large enough to account for the expected response rate. If you expect a low response rate, you will need to invite more people to ensure that you have sufficient data for your analysis.

The population size: If the population is small, you may need to sample a larger share of it to ensure that your sample is representative of the popu...
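The response-rate adjustment mentioned above can be illustrated with a short Python sketch. It assumes the common approach of dividing the number of completed questionnaires you need by the expected response rate; the function name `invitations_needed` and the figures used are hypothetical.

```python
import math


def invitations_needed(required_responses, response_rate):
    """Number of questionnaires to send out to obtain the required responses.

    Assumes invitations = required responses / expected response rate,
    rounded up (an illustrative convention, not taken from the post).
    """
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be in (0, 1]")
    return math.ceil(required_responses / response_rate)


# Hypothetical case: 385 completed questionnaires needed, 60% expected to respond
print(invitations_needed(385, 0.60))  # 642
```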

A Simple Example of Using R for SPANOVA Analysis

The following is an example data table that can be used in this split-plot ANOVA:

In this example, the independent variable tied to the main plot is disease type (heart disease or diabetes), while the independent variable tied to the subplot is drug dose (1 mg or 5 mg). The dependent variable is drug effectiveness, measured on a scale of 0 to 1.

The following is an example split-plot ANOVA in R, using hypothetical data on the effectiveness of a new drug in patients with different diseases:

# Load the required library
library(ez)
# Load the data
data <- read.csv("data_ubat.csv")
# Show the structure of the data
str(data)
# Run the split-plot ANOVA
aov_result <- ezANOVA(data = data,                   ...

Split-plot ANOVA (SPANOVA) Analysis

Split-plot ANOVA is a statistical method used to test hypotheses about significant differences between the means of several separate samples, using more than one independent variable. It is a version of ANOVA (Analysis of Variance) that groups samples into two categories: main plots and subplots. Split-plot ANOVA is used in situations where one independent variable is tied to the main plot while another independent variable is tied to the subplot. For example, if you want to test whether there is a significant difference between the mean profits of several different companies, the independent variable tied to the main plot might be company type (for example, manufacturing companies, service companies, etc.), while the independent variable tied to the subplot might be the type of product or service that is ...

Mann-Whitney U and the Kruskal-Wallis tests

Mann-Whitney U and Kruskal-Wallis are nonparametric statistical tests for comparing independent groups: Mann-Whitney U compares two groups, and Kruskal-Wallis compares three or more. These tests are often used when the data do not meet the assumptions of parametric tests, such as the assumption of normality.

To use these tests in SPSS (a statistical software package), your data must meet the following requirements:

Mann-Whitney U: This test requires two independent groups of data. The groups should be independent in the sense that the members of one group are not related to the members of the other group.

Kruskal-Wallis: This test is used with three or more independent groups of data. As with the Mann-Whitney U test, the members of one group should not be related to the members of the other groups.

The objectives of these tests are to determine whether there are significant differences between the gro...

Data Standardization in Statistics

            Data standardization is a statistical method that is used to transform data so that it has a mean of zero and a standard deviation of one. This is often done to make the data more comparable or to simplify the analysis.           There are several ways to standardize data, but the most common method is to subtract the mean from each data point and then divide by the standard deviation. This results in a new set of values with a mean of zero and a standard deviation of one.           Standardization is useful when comparing data from different sources or when the data has different units of measurement. For example, if you want to compare the heights of people in two different countries, you could standardize the data by converting the heights to standard deviation units (also known as z-scores). This would allow you to compare the data on a common scale, regardless of the units of ...
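The subtract-the-mean, divide-by-the-standard-deviation procedure described above can be sketched in Python as follows. The function name `standardize` and the sample heights are hypothetical, and the sketch assumes the population standard deviation (`statistics.pstdev`); use `statistics.stdev` instead if the sample standard deviation is wanted.

```python
import statistics


def standardize(values):
    """Convert values to z-scores: (value - mean) / standard deviation.

    Uses the population standard deviation (an assumption for this sketch).
    """
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("cannot standardize values with zero variance")
    return [(v - mean) / sd for v in values]


heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]  # hypothetical data
z_scores = standardize(heights_cm)
print([round(z, 3) for z in z_scores])
```

Because z-scores are unit-free, the same heights recorded in inches would produce the same standardized values.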

Data Transformation Techniques

Data transformation is a technique that is used to convert the data from one form to another, typically to improve the normality of the data or to stabilize the variance. There are many techniques that can be used to transform data, including: Square root transformation: This transformation is used to normalize data that is skewed to the right (positive skewness). To apply this transformation, you take the square root of each data point. Log transformation: This transformation is used to normalize data that is skewed to the right (positive skewness) or has a long tail on the right side of the distribution. To apply this transformation, you take the natural log of each data point. Box-Cox transformation: This transformation is a family of transformations that can be used to normalize data that is skewed to the right (positive skewness) or skewed to the left (negative skewness). To apply this transformation, you need to specify a parameter, lambda, which determines the type of transforma...
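The three transformations described above can be sketched in plain Python. This is a minimal illustration with hypothetical function names and data. It uses the usual definition of the Box-Cox transformation, (x^λ − 1) / λ for λ ≠ 0 and ln(x) for λ = 0, and assumes all values are positive; it does not estimate the best λ from the data.

```python
import math


def sqrt_transform(values):
    """Square root of each value (values must be non-negative)."""
    return [math.sqrt(v) for v in values]


def log_transform(values):
    """Natural log of each value (values must be positive)."""
    return [math.log(v) for v in values]


def box_cox(values, lam):
    """Box-Cox transform for a given lambda (values must be positive)."""
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1) / lam for v in values]


skewed = [1.0, 4.0, 9.0, 100.0]  # hypothetical right-skewed data
print(sqrt_transform(skewed))
print(log_transform(skewed))
print(box_cox(skewed, 0.5))
```

With λ = 0 the Box-Cox transform is the log transformation, and with λ = 0.5 it is a shifted and rescaled square root, which is why it is described as a family of transformations.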

Examples of Analyses That Require Normally Distributed Data

  Some statistical tests that require normally distributed data are:

t-test (Student's t-test)
Two-sample t-test
Independent t-test
Paired t-test
F-test (ANOVA)
Linear regression (where the normality assumption applies to the residuals)

The chi-square test is sometimes listed alongside these, but it works on categorical counts and does not assume normally distributed data.

The tests above require normally distributed data because their underlying assumption is that the data come from a normal distribution. If the data are not normally distributed, the test results may be inaccurate and their quality cannot be guaranteed. For this reason, before running any of these tests, it is important to check whether the data satisfy the normality assumption. If they do not, we can apply a data transformation or use other tests that do not require the normality assumption, such as nonparametric tests.

Type of Data Required for Properly Distributed Data

            A normally distributed dataset is one where the data follows a bell-shaped curve when plotted on a graph. Normal distribution is characterized by a mean, median, and mode that are all equal, and by a symmetrical distribution of data around the mean.           The data type needed for normally distributed data depends on the type of data being collected and the analysis you plan to perform. In general, only numerical data (continuous or integer-valued variables measured on an interval or ratio scale) can be approximately normally distributed; categorical data (such as nominal or ordinal variables) cannot follow a normal distribution.           If you are working with normally distributed data, you can use a variety of statistical techniques to analyze the data, including parametric tests (which assume that the data is normally distributed) and nonparametric tests (which do not assume a particular distribution).  ...

P-value in Statistics: What is it?

            In statistics, the p-value is a measure of the strength of the evidence against a null hypothesis. It is the probability of obtaining results at least as extreme as those observed, assuming that the null hypothesis is true. It is not the probability that the null hypothesis is true.           The null hypothesis is a statement that assumes that there is no relationship between the variables being tested. For example, if you are testing the effectiveness of a new drug, the null hypothesis might be that the drug has no effect on the condition it is intended to treat.           The p-value helps you to determine whether the observed results are strong enough to reject the null hypothesis. If the p-value is low, it means that results this extreme would be unlikely if the null hypothesis were true, and you can reject the null hypothesis in favor of an alternative hypothesis (such as the hypothesis that the drug is effective). On the other hand, if...
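A small worked example may help make the definition concrete. The sketch below computes an exact one-sided p-value for a hypothetical study in which 9 of 10 patients improve, under a null hypothesis that each patient has a 50% chance of improving regardless of treatment. The scenario, the 50% figure, and the function name `one_sided_binomial_p_value` are all assumptions made for illustration.

```python
from math import comb


def one_sided_binomial_p_value(successes, trials, p_null=0.5):
    """P(X >= successes) when X ~ Binomial(trials, p_null).

    This is the probability, computed under the null hypothesis, of a
    result at least as extreme as the one observed.
    """
    return sum(
        comb(trials, k) * p_null ** k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


p_value = one_sided_binomial_p_value(successes=9, trials=10)
print(round(p_value, 4))  # 0.0107
```

Here the p-value is 11/1024: out of the 1024 equally likely outcomes under the null hypothesis, only 11 show nine or more improvements.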

The Solutions to Statistics' Multicollinearity

            Multicollinearity is a statistical phenomenon that occurs when two or more predictor variables in a regression model are highly correlated with each other. This can lead to unstable and inaccurate coefficient estimates, as well as difficulties in interpreting the results of the model. There are several ways to address multicollinearity in a statistical model: Remove one or more of the correlated predictor variables: This can help to reduce multicollinearity by reducing the number of correlated variables in the model. However, this may also reduce the explanatory power of the model. Combine correlated predictor variables into a single composite variable: This can help to reduce multicollinearity by reducing the number of correlated variables in the model. However, this may also reduce the interpretability of the model. Use regularization techniques: Regularization techniques, such as ridge regression or lasso, can help to reduce multicolline...
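Before choosing one of these remedies, it helps to see which predictors are highly correlated in the first place. The sketch below flags predictor pairs whose Pearson correlation exceeds a cutoff. The function names, the example data, and the 0.8 threshold are all hypothetical choices for illustration; pairwise correlation is only a first screen and will not catch collinearity that involves three or more variables jointly.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def highly_correlated_pairs(predictors, threshold=0.8):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(predictors.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Hypothetical predictors: the same height recorded in two units, plus age
predictors = {
    "height_cm": [150, 160, 170, 180, 190],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "age": [34, 21, 45, 29, 38],
}
flagged = highly_correlated_pairs(predictors)
for name_a, name_b, r in flagged:
    print(name_a, name_b, round(r, 3))
```

In this example only the two height variables are flagged, so dropping one of them or combining them would be the natural first step.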

Determining The Sample Size for Linear Regression

            There are several factors that can influence the sample size required for a linear regression analysis, including the desired level of precision, the variability of the data, the number of predictor variables, and the desired level of statistical power.           One approach to calculating sample size for linear regression is to use a sample size calculator, which can be found online or in statistical software packages. These calculators typically allow you to specify the desired level of precision, the variability of the data, the number of predictor variables, and the desired level of statistical power, and they will provide an estimate of the sample size required to meet these criteria.           Another approach is to use a formula to calculate sample size based on the desired level of precision and the expected variability of the data. For example, the following formula can b...

How Do Partial Least Square Structural Equation Modeling and Covariance-based Structural Equation Modeling Vary from One Another?

            Covariance-based structural equation modeling (CB-SEM) and partial least squares structural equation modeling (PLS-SEM) are two methods for estimating structural equation models.           CB-SEM is a method for estimating and testing the relationships between observed variables and latent constructs in a model. It is based on the assumption that the observed variables are measured with error and that the relationships between the observed variables and latent constructs can be represented by a set of regression equations. CB-SEM estimates the model parameters by maximizing the likelihood of the data given the model.           PLS-SEM is a method for estimating and testing the relationships between observed variables and latent constructs in a model. It is based on the assumption that the observed variables are correlated with each other and with the latent constructs, and that the...

Structural Equation Modeling: A Quick Overview of the Lavaan Package in R

  Structural equation modeling (SEM) is a multivariate statistical technique that can be used to test and estimate relationships between observed variables and latent (unobserved) constructs. SEM allows you to test complex hypotheses about relationships between variables and can be used to test a variety of models, including confirmatory factor analysis, path analysis, and latent growth curve models.

To apply SEM in R, you can use the lavaan package. This package provides a wide range of functions for estimating, modifying, and evaluating SEM models. Here is an example of how you can use lavaan to fit a SEM model in R:

1. Install and load the lavaan package:

install.packages("lavaan")
library(lavaan)

2. Specify the model using the lavaan syntax. The syntax consists of a series of statements that define the model, including the relationships between observed variables and latent constructs, the measurement models for each observed variable, and any constraints on the...