Generating fake data

ebejer
Joined: 03/18/2010


I have some data that I want to replicate (to maintain realistic relationships between the variables). I copied the script that Ryne posted for everyone's convenience, and I was able to generate one data set. However, the data I want to replicate contain factors (unordered?), so it seems I need to use the second half of the script to retain the information they carry.

I have been trying to run that portion of the script. It runs without errors, but it does not actually work: no data are generated. Does anyone have suggestions for how I should modify the script (attached) to replicate the information I'm after?

Best wishes

Attachment: repData.R (2.87 KB)
Ryne
Joined: 07/31/2009
The issue I see is that you use the variable 'row' to define how many observations to generate. However, you deleted the line of code that defined 'row' as the number of rows in your data, so 'row' gets treated as numeric(0). As a result, the rmvnorm call generates zero rows of data, and fakefac ends up as a data.frame with zero rows.
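
To illustrate the fix, here is a minimal base-R sketch, not the actual repData.R code. The names 'dat', 'n_rows', 'sigma', 'mu' and 'fake' are made up for the example, and the Cholesky-based draw is only a stand-in for the rmvnorm call so that the snippet runs without extra packages. The point is simply that the row count is defined from the data before it is used, and checked so that an empty count fails loudly instead of quietly producing zero rows.

```r
# Hypothetical stand-in for the real data set (values are made up)
dat <- data.frame(x = c(1.2, 3.4, 5.6, 7.1), y = c(2.0, 2.9, 6.1, 6.8))

# Define the row count from the data before anything uses it
n_rows <- nrow(dat)

# Fail early if the count is empty or zero, rather than generating zero rows
stopifnot(length(n_rows) == 1, n_rows > 0)

# Base-R multivariate normal draw via a Cholesky factor (stand-in for rmvnorm)
sigma <- cov(dat)
mu <- colMeans(dat)
z <- matrix(rnorm(n_rows * ncol(dat)), nrow = n_rows)
fake <- sweep(z %*% chol(sigma), 2, mu, "+")

dim(fake)  # same number of rows and columns as 'dat'
```

In your script the equivalent would be to restore the line that sets 'row' to the number of rows in your data before the rmvnorm call.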

More generally, I think fakeData needs a rewrite. I've received a few too many error reports involving ordinal data. In most of them, hetcor reports a non-positive definite correlation matrix for an existing dataset even though the original data run fine. If anyone has feature suggestions for fakeData 1.1/2.0, let me know.
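
For anyone who wants to check a correlation matrix before simulating from it, here is a rough sketch of a positive-definiteness test based on eigenvalues. This is not how hetcor or fakeData does it; 'is_pos_def', 'tol' and the example matrix 'r' are hypothetical and the values are made up for illustration.

```r
# Hypothetical 3x3 "correlation" matrix with mutually inconsistent entries
r <- matrix(c( 1.0,  0.9,  0.9,
               0.9,  1.0, -0.9,
               0.9, -0.9,  1.0), nrow = 3, byrow = TRUE)

# A symmetric matrix is positive definite when all its eigenvalues are > 0;
# 'tol' is a small assumed threshold to absorb rounding error
is_pos_def <- function(m, tol = 1e-8) {
  ev <- eigen(m, symmetric = TRUE, only.values = TRUE)$values
  all(ev > tol)
}

is_pos_def(r)        # FALSE: at least one eigenvalue is negative
is_pos_def(diag(3))  # TRUE: all eigenvalues of the identity are 1
```

A check like this only tells you whether the matrix can be used as is. What to do when it fails (dropping variables, smoothing the matrix, and so on) is a separate decision.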