Plot scatter plot when values are wrongly paired
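I am trying to create some correlation plots from a data frame that I created with tidyr's spread() function. When I used spread(), it created NAs in the new data frame. This makes sense, because the data frame holds concentration values for different parameters at different time periods.
Here is a sample screenshot of the original data frame:
When I use spread(), it gives me a data frame like this (sample data):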
df <- structure(list(
  orgid = c("11NPSWRD", "11NPSWRD", "11NPSWRD", "11NPSWRD", "11NPSWRD",
            "11NPSWRD", "11NPSWRD", "11NPSWRD", "11NPSWRD", "11NPSWRD",
            "11NPSWRD", "11NPSWRD", "11NPSWRD", "11NPSWRD", "11NPSWRD",
            "11NPSWRD", "11NPSWRD", "11NPSWRD", "11NPSWRD", "11NPSWRD"),
  locid = c("11NPSWRD-MORR_NPS_PR2", "11NPSWRD-MORR_NPS_PR2", "11NPSWRD-MORR_NPS_PR2",
            "11NPSWRD-MORR_NPS_PR2", "11NPSWRD-MORR_NPS_PR2", "11NPSWRD-MORR_NPS_PR2",
            "11NPSWRD-MORR_NPS_PR2", "11NPSWRD-MORR_NPS_PR2", "11NPSWRD-MORR_NPS_PR2",
            "11NPSWRD-MORR_NPS_PR2", "11NPSWRD-MORR_NPS_PR2", "11NPSWRD-MORR_NPS_PR2",
            "11NPSWRD-MORR_NPS_PR2", "11NPSWRD-MORR_NPS_PR2", "11NPSWRD-MORR_NPS_PR2",
            "11NPSWRD-MORR_NPS_PR2", "11NPSWRD-MORR_NPS_PR2", "11NPSWRD-MORR_NPS_PR2",
            "11NPSWRD-MORR_NPS_PR2", "11NPSWRD-MORR_NPS_PR2"),
  stdate = structure(c(9891, 9891, 9891, 9920, 9920, 9920, 9949, 9949, 9949,
                       9978, 9978, 9978, 10011, 10011, 10011, 10067, 10067,
                       10073, 10073, 10073), class = "Date"),
  sttime = structure(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
                     class = c("hms", "difftime"), units = "secs"),
  valunit = c("uS/cm", "mg/l", "mg/l", "uS/cm", "mg/l", "mg/l", "uS/cm", "mg/l", "mg/l",
              "uS/cm", "mg/l", "mg/l", "uS/cm", "mg/l", "mg/l", "uS/cm", "mg/l",
              "uS/cm", "mg/l", "mg/l"),
  swqs = c("FW2-TP", "FW2-TP", "FW2-TP", "FW2-TP", "FW2-TP", "FW2-TP", "FW2-TP",
           "FW2-TP", "FW2-TP", "FW2-TP", "FW2-TP", "FW2-TP", "FW2-TP", "FW2-TP",
           "FW2-TP", "FW2-TP", "FW2-TP", "FW2-TP", "FW2-TP", "FW2-TP"),
  WMA = c(6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L),
  year = c(1997L, 1997L, 1997L, 1997L, 1997L, 1997L, 1997L, 1997L, 1997L, 1997L,
           1997L, 1997L, 1997L, 1997L, 1997L, 1997L, 1997L, 1997L, 1997L, 1997L),
  Chloride = c(NA, 35, NA, NA, 45, NA, NA, 30, NA, NA, 30, NA, NA, 30, NA, NA, NA, NA, 35, NA),
  `Specific conductance` = c(224, NA, NA, 248, NA, NA, 204, NA, NA, 166, NA, NA,
                             189, NA, NA, 119, NA, 194, NA, NA),
  `Total dissolved solids` = c(NA, NA, 101, NA, NA, 115, NA, NA, 96, NA, NA, 79,
                               NA, NA, 89, NA, 56, NA, NA, 92)),
  .Names = c("orgid", "locid", "stdate", "sttime", "valunit", "swqs", "WMA", "year",
             "Chloride", "Specific conductance", "Total dissolved solids"),
  row.names = c(NA, 20L), class = "data.frame")
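The problem I am having is that when I try to create a correlation plot, I get a plot with only one point. I am guessing this is because of the NAs in the data frame. But when I try to filter out the NAs, I end up with a data frame with 0 observations. Any help would be greatly appreciated!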
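The empty result after filtering can be reproduced on a small scale. The sketch below uses a hypothetical four-row data frame (`toy`, built from the first two sampling dates of the posted data) and only base R. The names `toy` and `both_present` are illustrative and do not come from the original post.

```r
# Toy subset of the posted data: after spread(), each measurement
# sits on its own row, so the two parameters never share a row
toy <- data.frame(
  stdate = as.Date(c(9891, 9891, 9920, 9920), origin = "1970-01-01"),
  Chloride = c(NA, 35, NA, 45),
  `Specific conductance` = c(224, NA, 248, NA),
  check.names = FALSE  # keep the space in the column name
)

# Rows where both parameters are present
both_present <- complete.cases(toy[, c("Chloride", "Specific conductance")])
sum(both_present)

# Non-missing values per column: each parameter has data on its own
colSums(!is.na(toy[, c("Chloride", "Specific conductance")]))

# Filtering on both columns therefore leaves nothing to plot
nrow(toy[both_present, ])
```

Each parameter has values, but never on the same row, so any filter that requires both columns at once removes every row.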
Sample code for creating the correlation plot:
plot1 <- ggplot(data = df, aes(x = "Specific conductance", y = "Chloride")) +
  geom_smooth(method = "lm", se = FALSE, color = "black", formula = y ~ x) +
  geom_point()
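I would like to create a plot like this:
- Remove the quotes from aes(x = "Specific conductance", y = "Chloride"). Because the column name contains a space, use: aes(x = `Specific conductance`, y = Chloride)
- @PoGibas When I do that, I get this -> Error: Cannot add ggproto objects together. Did you forget to add this object to a ggplot object?
- As you mentioned, your data is in an odd format, since each numeric value is paired with an NA. You need to remove the NAs.
- Thank you for the answer, Mr. Dong! You rock!
- By the way, if you want to separate the equations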
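To illustrate the quoting advice in the comments above, here is a minimal sketch on hypothetical, already-paired toy data (`paired_toy`, with values taken from the posted sample). It assumes ggplot2 is installed. The names `paired_toy`, `p_quoted` and `p_fixed` are illustrative only. As for the error quoted in the follow-up comment, ggplot2 raises it when layers are added to each other with no ggplot() object on the left of the first `+`, for example when the `ggplot(...)` line no longer ends in `+`. Whether that is what happened here is a guess.

```r
# Hypothetical, already-paired toy data (values from the posted sample)
paired_toy <- data.frame(
  `Specific conductance` = c(224, 248, 204, 166, 189),
  Chloride = c(35, 45, 30, 30, 30),
  check.names = FALSE  # keep the space in the column name
)

# Assumption: ggplot2 is installed; the guard only keeps the script
# from failing where it is not
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # Quoted names are string constants: every row is mapped to the same
  # x and y, so all points are drawn on top of each other
  p_quoted <- ggplot(paired_toy, aes(x = "Specific conductance", y = "Chloride")) +
    geom_point()

  # Backticks refer to the column itself, so each row gets its own position
  p_fixed <- ggplot(paired_toy, aes(x = `Specific conductance`, y = Chloride)) +
    geom_point()
}
```

Printing `p_quoted` and `p_fixed` side by side shows the difference. This covers only the syntax: on the posted data the NAs still have to be dealt with before any pairs can be plotted.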
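A quick and dirty solution is to modify the data you already have: merge it with itself on specific columns and keep the matches where both values are not NA.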
# Merge data with itself
# Here I'm only guessing columns that need to match between
# Conductance and Chloride
df2 <- merge(df, df, c("orgid", "locid", "stdate"))
# This will give table with multiple duplicate rows (all possible combinations)
# Select only those combinations where both values are not NA
# Plot
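The last two steps of the answer (selecting the non-NA combinations and plotting) survive here only as comments. Below is one way they might be filled in. It is a sketch, not necessarily the answerer's exact code. It runs on a hypothetical reduced data frame `df_toy` (three sampling dates from the posted sample) and relies on merge() appending `.x` and `.y` to the non-key columns. The names `df_toy` and `paired` are illustrative, and the plotting part assumes ggplot2 is installed.

```r
# Toy stand-in for the spread() output (values from the posted sample)
df_toy <- data.frame(
  orgid  = "11NPSWRD",
  locid  = "11NPSWRD-MORR_NPS_PR2",
  stdate = as.Date(rep(c(9891, 9920, 9949), each = 2), origin = "1970-01-01"),
  Chloride = c(NA, 35, NA, 45, NA, 30),
  `Specific conductance` = c(224, NA, 248, NA, 204, NA),
  check.names = FALSE  # keep the space in the column name
)

# Self-merge on the key columns: every row is combined with every row
# sharing the same keys; non-key columns get the suffixes .x and .y
df2 <- merge(df_toy, df_toy, by = c("orgid", "locid", "stdate"))

# Keep only combinations where conductance (left copy) and
# chloride (right copy) are both present
paired <- subset(df2, !is.na(`Specific conductance.x`) & !is.na(Chloride.y))
paired[, c("stdate", "Specific conductance.x", "Chloride.y")]

# Plot the paired values (assumption: ggplot2 is installed)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  plot1 <- ggplot(paired, aes(x = `Specific conductance.x`, y = Chloride.y)) +
    geom_smooth(method = "lm", se = FALSE, color = "black", formula = y ~ x) +
    geom_point()
  # print(plot1) draws the figure
}
```

The choice of key columns decides which measurements count as a pair. As the answer notes, `orgid`, `locid` and `stdate` are a guess. If several samples share a date at one location, the self-merge pairs every conductance value with every chloride value from that date.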
- @KWANGER Bind the data by those columns, so that you get numeric values paired with numeric values.
Source: https://www.codenong.com/52261892/