
hive context doesn't recognize temp table in pyspark – AnalysisException: 'Table not found'



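I'm using pyspark (1.6.1) running in local mode.
I have a dataframe loaded from a csv file, and I need to add a dense_rank() column to it.
I understand that sqlContext doesn't support window functions, but HiveContext does.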


hiveContext = HiveContext(sc)
df.registerTempTable("visits")
visit_number = hiveContext.sql("select store_number, "
                               "dense_rank() over(partition by store_number order by visit_date) visit_number "
                               "from visits")

I get the error message:
AnalysisException: u'Table not found: visits;'

It comes after this warning: WARN ObjectStore: Failed to get database default, returning NoSuchObjectException

After reading earlier questions, I tried changing the ConnectionURL in conf/hive_defaults.xml to the exact location of the hive directory, but it didn't help.

Does anyone know what causes this?

Thanks!

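Apparently, all I had to do was delete the hiveContext I had created and switch sqlContext from SQLContext to HiveContext. Creating both of them (sqlContext and hiveContext) in the same python script won't work. This problem keeps coming up, and I didn't see a solution anywhere. Hope it helps someone.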

Result:
After removing the SQLContext and using only the HiveContext, everything works fine.
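
A likely explanation, offered as a sketch and not as a definitive diagnosis: in Spark 1.x, registerTempTable stores the table in the catalog of the context that created the DataFrame, so a separately created HiveContext cannot see it. The example below does the load, the registration, and the query on a single context. The names add_visit_number, VISITS_QUERY, hive_ctx and csv_path are made up for illustration; the load options are copied from the comment further down in this thread and assume the spark-csv package is available.

```python
# Sketch for PySpark 1.6.x: one context does the load, the registration
# and the query. All names here are illustrative, not from the thread.

# Note the trailing spaces: adjacent string literals are concatenated as-is.
VISITS_QUERY = (
    "select store_number, "
    "dense_rank() over (partition by store_number order by visit_date) as visit_number "
    "from visits"
)


def add_visit_number(hive_ctx, csv_path):
    """Load the CSV, register it as 'visits', and run the window query.

    hive_ctx is assumed to be a pyspark.sql.HiveContext built by the caller.
    """
    # Create the DataFrame from the same context that will run the SQL.
    df = hive_ctx.read.load(
        csv_path,
        format="com.databricks.spark.csv",  # assumes spark-csv is on the classpath
        header="true",
        inferSchema="true",
    )
    # The temp table is registered with the context that created df,
    # which is hive_ctx, so hive_ctx.sql() can resolve "visits".
    df.registerTempTable("visits")
    return hive_ctx.sql(VISITS_QUERY)
```

In a script this would be called as add_visit_number(HiveContext(sc), "df.csv"), with HiveContext imported from pyspark.sql and no second SQLContext created anywhere in the same script.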

SQLContext is used to collect the data, much like a view from the HiveContext class.

You should create the DataFrame before calling registerTempTable:

MyDataFrame <- read.df(sqlContext, CsvPath, source = "com.databricks.spark.csv", header = "true")

Then:

registerTempTable(MyDataFrame, "visits")

df is already a dataframe from a csv file: df = sqlContext.read.load('df.csv', format='com.databricks.spark.csv', header='true', inferSchema='true')

Source: https://www.codenong.com/36461119/
