maxy

Member No. 1156
Registered 2017-12-26 14:39:29
Last active 2018-04-09 17:19:58


Latest Comments
  • spark sortByKey ? at 2018-04-09 17:12:10

    Yes, I looked it up, and that is indeed the case. This function is great.

  • spark sortByKey ? at 2018-04-09 09:34:32

    JavaPairRDD<Integer,String> rdd2 = pairRDD.sortByKey(false,3);
    This line sorts pairRDD across 3 partitions. Is the resulting rdd2 globally sorted or only sorted within each partition? (My test shows the result is globally sorted, but here is my question: isn't the sorting done separately inside each partition? How does the result come out globally sorted?)
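
    For what it's worth, this matches how sortByKey works: it first shuffles the data with a RangePartitioner, so each of the 3 partitions receives a contiguous, non-overlapping key range, and only then sorts within each partition. Reading the partitions in index order therefore yields a total order. A minimal sketch to see the per-partition ranges (sample data made up):

    import java.util.Arrays;
    import java.util.List;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import scala.Tuple2;

    public class SortByKeyDemo {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("SortByKeyDemo").setMaster("local[*]");
            JavaSparkContext sc = new JavaSparkContext(conf);

            List<Tuple2<Integer, String>> data = Arrays.asList(
                    new Tuple2<>(3, "c"), new Tuple2<>(1, "a"), new Tuple2<>(5, "e"),
                    new Tuple2<>(2, "b"), new Tuple2<>(4, "d"));
            JavaPairRDD<Integer, String> pairRDD = sc.parallelizePairs(data);

            // Range-partition first (descending here), then sort inside each partition.
            JavaPairRDD<Integer, String> rdd2 = pairRDD.sortByKey(false, 3);

            // glom() exposes per-partition contents: partition 0 holds the
            // largest keys, partition 2 the smallest, so concatenating
            // partitions 0..2 gives a globally sorted result.
            List<List<Tuple2<Integer, String>>> parts = rdd2.glom().collect();
            for (int i = 0; i < parts.size(); i++) {
                System.out.println("partition " + i + ": " + parts.get(i));
            }
            sc.stop();
        }
    }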

  • spark sortByKey ? at 2018-04-09 09:09:59

    Yes, a big thumbs up.

  • oracle sql data extraction? at 2018-02-24 15:58:53

    Yep, that's right!!

  • oracle sql data extraction? at 2018-02-24 15:40:12

    The problem is solved; it was done with the pivot function. Ah well, this kind of problem is really better left to a professional DBA.
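
    For readers hitting the same thing: Oracle's PIVOT clause (11g+) turns one row per (key, category) into one row per key with a column per category. A minimal sketch via plain JDBC; the table sales(region, quarter, amount), connection string, and credentials are all hypothetical:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class PivotDemo {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:oracle:thin:@//dbhost:1521/orcl", "user", "pass");
                 Statement st = conn.createStatement()) {
                // One output row per region, one column per quarter.
                String sql = "SELECT * FROM (SELECT region, quarter, amount FROM sales) "
                        + "PIVOT (SUM(amount) FOR quarter IN "
                        + "('Q1' AS q1, 'Q2' AS q2, 'Q3' AS q3, 'Q4' AS q4))";
                try (ResultSet rs = st.executeQuery(sql)) {
                    while (rs.next()) {
                        System.out.println(rs.getString("REGION") + " Q1=" + rs.getString("Q1"));
                    }
                }
            }
        }
    }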

  • spark ml ? at 2018-02-24 14:35:09

    Hive analysis? What method did you use? Please share some pointers. The PhD on our team did it with a Python algorithm, computing a probability for each feature and filtering on that.
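
    For comparison, Spark ML itself ships a chi-square feature selector that scores each feature against the label and keeps the top k, similar in spirit to ranking features by a per-feature statistic and filtering. A minimal sketch (the toy rows are made up):

    import java.util.Arrays;
    import java.util.List;
    import org.apache.spark.ml.feature.ChiSqSelector;
    import org.apache.spark.ml.linalg.VectorUDT;
    import org.apache.spark.ml.linalg.Vectors;
    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.RowFactory;
    import org.apache.spark.sql.SparkSession;
    import org.apache.spark.sql.types.DataTypes;
    import org.apache.spark.sql.types.Metadata;
    import org.apache.spark.sql.types.StructField;
    import org.apache.spark.sql.types.StructType;

    public class FeatureSelectDemo {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("FeatureSelectDemo").master("local[*]").getOrCreate();

            List<Row> rows = Arrays.asList(
                    RowFactory.create(0.0, Vectors.dense(0.0, 0.0, 18.0, 1.0)),
                    RowFactory.create(1.0, Vectors.dense(0.0, 1.0, 12.0, 0.0)),
                    RowFactory.create(1.0, Vectors.dense(1.0, 0.0, 15.0, 0.1)));
            StructType schema = new StructType(new StructField[]{
                    new StructField("label", DataTypes.DoubleType, false, Metadata.empty()),
                    new StructField("features", new VectorUDT(), false, Metadata.empty())});
            Dataset<Row> df = spark.createDataFrame(rows, schema);

            // Score every feature against the label with a chi-square test
            // and keep the 2 highest-scoring ones.
            ChiSqSelector selector = new ChiSqSelector()
                    .setNumTopFeatures(2)
                    .setFeaturesCol("features")
                    .setLabelCol("label")
                    .setOutputCol("selectedFeatures");
            selector.fit(df).transform(df).show(false);
            spark.stop();
        }
    }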

  • spark-submit ? at 2018-02-02 16:54:59

    I've been stuck on this for a long time now. So frustrating!

  • spark-submit ? at 2018-02-02 15:27:04

    Yes, client mode works fine. What I packaged was the dependency jars.

  • spark-submit ? at 2018-02-02 15:21:43

    18/02/02 15:20:25 INFO yarn.Client: Application report for application_1517482621865_0005 (state: ACCEPTED)
    18/02/02 15:20:26 INFO yarn.Client: Application report for application_1517482621865_0005 (state: ACCEPTED)
    18/02/02 15:20:27 INFO yarn.Client: Application report for application_1517482621865_0005 (state: FAILED)
    18/02/02 15:20:27 INFO yarn.Client:
    client token: N/A
    diagnostics: Application application_1517482621865_0005 failed 2 times due to AM Container for appattempt_1517482621865_0005_000002 exited with exitCode: 15
    For more detailed output, check application tracking page:http://master:8088/proxy/application_1517482621865_0005/Then, click on links to logs of each attempt.
    Diagnostics: Exception from container-launch.
    Container id: container_1517482621865_0005_02_000001
    Exit code: 15
    Stack trace: ExitCodeException exitCode=15:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
    at org.apache.hadoop.util.Shell.run(Shell.java:455)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)

    Container exited with a non-zero exit code 15
    Failing this attempt. Failing the application.
    ApplicationMaster host: N/A
    ApplicationMaster RPC port: -1
    queue: default
    start time: 1517555966590
    final status: FAILED
    tracking URL: http://master:8088/cluster/app/application_1517482621865_0005
    user: root
    Exception in thread "main" org.apache.spark.SparkException: Application application_1517482621865_0005 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1029)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1076)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
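
    A note on the report above: for Spark on YARN, exit code 15 from the AM container generally means the user class threw an exception inside the ApplicationMaster, so the real stack trace sits in the container logs rather than in this client-side summary. It can be pulled with the application id from the log above:

    yarn logs -applicationId application_1517482621865_0005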

  • spark ml logisticRegression (logistic regression) ? at 2018-01-31 12:12:17

    This problem is most likely the feature extraction being wrong; you can't compute directly on the raw data.

    One more question: for libsvm files, what is a good method for extracting features?
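
    For libsvm input specifically, Spark's built-in "libsvm" data source already parses each line into (label, features) columns, and standardizing the feature vector is one common step before logistic regression instead of using the raw values. A minimal sketch (the file path is hypothetical):

    import org.apache.spark.ml.feature.StandardScaler;
    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class LibsvmLoadDemo {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("LibsvmLoadDemo").master("local[*]").getOrCreate();

            // Each libsvm line "label idx:val idx:val ..." becomes a row
            // with a double label and a sparse feature vector.
            Dataset<Row> df = spark.read().format("libsvm")
                    .load("data/sample_libsvm_data.txt");

            // Scale features to unit standard deviation before training.
            StandardScaler scaler = new StandardScaler()
                    .setInputCol("features").setOutputCol("scaledFeatures")
                    .setWithMean(false).setWithStd(true);
            scaler.fit(df).transform(df).show(5, false);
            spark.stop();
        }
    }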

  • spark mllib? at 2018-01-30 17:06:00

    I feel this method has a problem (it's one from GitHub): the final F-Measure area = 1. It can't equal 1, can it? It should only ever approach 1. I don't know how to verify it.
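
    One way to check is to recompute the per-threshold F-measure with MLlib's BinaryClassificationMetrics instead of trusting a single aggregate number. Note that on a perfectly separable sample a value of exactly 1.0 is possible; on real data it usually hints at overfitting or label leakage. A minimal sketch (the score/label pairs are made up):

    import java.util.Arrays;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics;
    import scala.Tuple2;

    public class FMeasureCheck {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("FMeasureCheck").setMaster("local[*]");
            JavaSparkContext sc = new JavaSparkContext(conf);

            // (prediction score, true label) pairs.
            JavaRDD<Tuple2<Object, Object>> scoreAndLabels = sc.parallelize(Arrays.asList(
                    new Tuple2<Object, Object>(0.9, 1.0),
                    new Tuple2<Object, Object>(0.8, 1.0),
                    new Tuple2<Object, Object>(0.4, 0.0),
                    new Tuple2<Object, Object>(0.1, 0.0)));

            BinaryClassificationMetrics metrics =
                    new BinaryClassificationMetrics(scoreAndLabels.rdd());

            // Print F1 at every threshold to see where (and whether) it hits 1.0.
            for (Tuple2<Object, Object> t : metrics.fMeasureByThreshold().toJavaRDD().collect()) {
                System.out.println("threshold=" + t._1() + " F1=" + t._2());
            }
            sc.stop();
        }
    }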

  • webservice value retrieval issue at 2018-01-20 00:23:19

    As soon as I read, I get "stream closed". I'm out of ideas on this one too.

  • webservice value retrieval issue at 2018-01-20 00:13:30

    This one also fails to get the value:

  • spark lambda expression argument passing at 2017-12-27 17:40:58

    No need, it's solved.

  • spark2.2 MySQL query issue at 2017-12-26 17:10:39

    This is a bit of a trap: the url can't be passed in as a parameter; if it's passed as a parameter, no data is output. Does it really have to be hard-coded inside?? (See the sketch at the bottom of this page.)

  • spark2.2 MySQL query issue at 2017-12-26 16:44:12

    Yes, exactly. Many thanks!! Many thanks!!

  • spark2.2 MySQL query issue at 2017-12-26 15:20:33

    That's the 1.6 style; SQLContext has been deprecated in 2.2.0. The official 2.2.0 way is spark.read().format("jdbc")....;
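
    For reference, a minimal Spark 2.2 sketch of that style; host, database, table, and credentials are placeholders. The url is an ordinary option value here, so it can be passed in as a variable rather than hard-coded, which bears on the question above about parameterizing it:

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class JdbcReadDemo {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("JdbcReadDemo").master("local[*]").getOrCreate();

            // Passing the url as a variable works; it is just an option value.
            String url = "jdbc:mysql://dbhost:3306/testdb?useSSL=false";

            Dataset<Row> df = spark.read().format("jdbc")
                    .option("url", url)
                    .option("dbtable", "some_table")
                    .option("user", "root")
                    .option("password", "secret")
                    .load();
            df.show(5);
            spark.stop();
        }
    }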